On-device GenAI APIs as part of ML Kit help you easily build with Gemini Nano


Posted by Caren Chang – Developer Relations Engineer, Chengji Yan – Software Engineer, Taj Darra – Product Manager

We’re excited to announce a set of on-device GenAI APIs, as part of ML Kit, to help you integrate Gemini Nano in your Android apps.

To start, we’re releasing four new APIs:

    • Summarization: to summarize articles and conversations
    • Proofreading: to polish short text
    • Rewriting: to reword text in different styles
    • Image Description: to provide a short description for images

Key benefits of GenAI APIs

GenAI APIs are high-level APIs that allow for easy integration, similar to existing ML Kit APIs. This means you can expect quality results out of the box without extra effort for prompt engineering or fine-tuning for specific use cases.

GenAI APIs run on-device and thus provide the following benefits:

    • Input, inference, and output data is processed locally
    • Functionality stays the same without a reliable internet connection
    • No additional cost incurred for each API call

To prevent misuse, we also added safety protection in various layers, including base model training, safety-aware LoRA fine-tuning, input and output classifiers, and safety evaluations.

How GenAI APIs are built

There are four main components that make up each of the GenAI APIs.

  1. Gemini Nano is the base model, serving as the foundation shared by all APIs.
  2. Small API-specific LoRA adapter models are trained and deployed on top of the base model to further improve the quality for each API.
  3. Optimized inference parameters (e.g. prompt, temperature, topK, batch size) are tuned for each API to guide the model in returning the best results.
  4. An evaluation pipeline ensures quality across various datasets and attributes. This pipeline consists of LLM raters, statistical metrics, and human raters.

Together, these components make up the high-level GenAI APIs that simplify the effort needed to integrate Gemini Nano in your Android app.

Evaluating quality of GenAI APIs

For each API, we formulate a benchmark score based on the evaluation pipeline mentioned above. This score is based on attributes specific to a task. For example, when evaluating the summarization task, one of the attributes we look at is “grounding” (i.e., factual consistency of the generated summary with the source content).

To provide out-of-box quality for GenAI APIs, we applied feature-specific fine-tuning on top of the Gemini Nano base model. This resulted in an increase in the benchmark score of each API, as shown below:

Use case in English    Gemini Nano Base Model    ML Kit GenAI API
Summarization          77.2                      92.1
Proofreading           84.3                      90.2
Rewriting              79.5                      84.1
Image Description      86.9                      92.3
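
To make the gains in the table above concrete, here is a minimal Kotlin sketch that computes the absolute and relative improvement for each use case. The scores are copied from the table; the `BenchmarkRow` type and its property names are illustrative assumptions and are not part of ML Kit.

```kotlin
// Illustrative only: BenchmarkRow and its helpers are not part of ML Kit.
data class BenchmarkRow(val useCase: String, val baseScore: Double, val apiScore: Double) {
    // Absolute gain in benchmark points.
    val absoluteGain: Double get() = apiScore - baseScore

    // Gain relative to the base model score, in percent.
    val relativeGainPercent: Double get() = absoluteGain / baseScore * 100.0
}

// Scores copied from the benchmark table above.
val benchmarkRows = listOf(
    BenchmarkRow("Summarization", 77.2, 92.1),
    BenchmarkRow("Proofreading", 84.3, 90.2),
    BenchmarkRow("Rewriting", 79.5, 84.1),
    BenchmarkRow("Image Description", 86.9, 92.3),
)

fun main() {
    for (row in benchmarkRows) {
        println(
            "%-18s +%.1f points (%.1f%% relative)".format(
                row.useCase, row.absoluteGain, row.relativeGainPercent
            )
        )
    }
}
```

Summarization shows the largest absolute gain in this table (14.9 points), while the other three use cases improve by roughly 4.6 to 5.9 points.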

In addition, this is a quick reference of how the APIs perform on a Pixel 9 Pro:

                 Prefix speed (input processing rate)                   Decode speed (output generation rate)
Text-to-text     510 tokens/second                                      11 tokens/second
Image-to-text    510 tokens/second + 0.8 seconds for image encoding     11 tokens/second
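
As a rough illustration of what these rates imply, the sketch below estimates end-to-end latency from token counts. It is a back-of-the-envelope model only: it assumes latency is simply input processing time plus optional image encoding plus output generation time, using the Pixel 9 Pro figures quoted above. The function and constant names are hypothetical.

```kotlin
// Illustrative estimate only; the rates are the Pixel 9 Pro figures quoted above.
const val PREFIX_TOKENS_PER_SECOND = 510.0
const val DECODE_TOKENS_PER_SECOND = 11.0
const val IMAGE_ENCODING_SECONDS = 0.8

// Rough latency model: input processing + optional image encoding + output generation.
fun estimateLatencySeconds(inputTokens: Int, outputTokens: Int, hasImage: Boolean = false): Double {
    require(inputTokens >= 0 && outputTokens >= 0) { "Token counts must be non-negative" }
    val prefixSeconds = inputTokens / PREFIX_TOKENS_PER_SECOND
    val decodeSeconds = outputTokens / DECODE_TOKENS_PER_SECOND
    val imageSeconds = if (hasImage) IMAGE_ENCODING_SECONDS else 0.0
    return prefixSeconds + imageSeconds + decodeSeconds
}

fun main() {
    // Hypothetical request: a 1,020-token article summarized into 55 output tokens.
    val seconds = estimateLatencySeconds(inputTokens = 1020, outputTokens = 55)
    println("Estimated latency: %.1f s".format(seconds))
}
```

For the hypothetical request in `main`, input processing takes about 2 seconds and decoding about 5 seconds, so output length dominates the total. This is one reason streaming the response can make the wait feel shorter.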

Sample usage

This is an example of implementing the GenAI Summarization API to get a one-bullet summary of an article:

val articleToSummarize = "We're excited to announce a set of on-device generative AI APIs..."

// Define task with desired input and output format
val summarizerOptions = SummarizerOptions.builder(context)
    .setInputType(InputType.ARTICLE)
    .setOutputType(OutputType.ONE_BULLET)
    .setLanguage(Language.ENGLISH)
    .build()
val summarizer = Summarization.getClient(summarizerOptions)

suspend fun prepareAndStartSummarization(context: Context) {
    // Check feature availability. Status will be one of the following:
    // UNAVAILABLE, DOWNLOADABLE, DOWNLOADING, AVAILABLE
    val featureStatus = summarizer.checkFeatureStatus().await()

    if (featureStatus == FeatureStatus.DOWNLOADABLE) {
        // Download feature if necessary.
        // If downloadFeature is not called, the first inference request will
        // also trigger the feature to be downloaded if it's not already
        // downloaded.
        summarizer.downloadFeature(object : DownloadCallback {
            override fun onDownloadStarted(bytesToDownload: Long) { }

            override fun onDownloadFailed(e: GenAiException) { }

            override fun onDownloadProgress(totalBytesDownloaded: Long) {}

            override fun onDownloadCompleted() {
                startSummarizationRequest(articleToSummarize, summarizer)
            }
        })
    } else if (featureStatus == FeatureStatus.DOWNLOADING) {
        // Inference request will automatically run once feature is
        // downloaded.
        // If Gemini Nano is already downloaded on the device, the
        // feature-specific LoRA adapter model will be downloaded very
        // quickly. However, if Gemini Nano is not already downloaded,
        // the download process may take longer.
        startSummarizationRequest(articleToSummarize, summarizer)
    } else if (featureStatus == FeatureStatus.AVAILABLE) {
        startSummarizationRequest(articleToSummarize, summarizer)
    }
}

fun startSummarizationRequest(text: String, summarizer: Summarizer) {
    // Create task request
    val summarizationRequest = SummarizationRequest.builder(text).build()

    // Start summarization request with streaming response
    summarizer.runInference(summarizationRequest) { newText ->
        // Show new text in UI
    }

    // You can also get a non-streaming response from the request
    // val summarizationResult = summarizer.runInference(summarizationRequest)
    // val summary = summarizationResult.get().summary
}

// Be sure to release the resource when no longer needed
// For example, on viewModel.onCleared() or activity.onDestroy()
summarizer.close()
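
The status handling in the sample above boils down to a small decision table, which can be sketched as a pure function that is easy to unit-test without a device. This is a sketch under stated assumptions: `Status` and `NextStep` are hypothetical stand-in enums, not ML Kit types, and the `UNAVAILABLE` fallback is an assumption, since the sample above does not handle that case.

```kotlin
// Stand-in for the four statuses listed in the sample; not the ML Kit FeatureStatus type.
enum class Status { UNAVAILABLE, DOWNLOADABLE, DOWNLOADING, AVAILABLE }

// Hypothetical actions an app might take for each status.
enum class NextStep { SHOW_UNSUPPORTED, DOWNLOAD_THEN_INFER, INFER }

// Maps a feature status to the action taken in the sample above.
fun nextStepFor(status: Status): NextStep = when (status) {
    // Assumption: the sample does nothing here; an app would likely hide the feature.
    Status.UNAVAILABLE -> NextStep.SHOW_UNSUPPORTED
    // Start the download explicitly and run inference once it completes.
    Status.DOWNLOADABLE -> NextStep.DOWNLOAD_THEN_INFER
    // Inference can be requested right away; it runs once the feature is ready.
    Status.DOWNLOADING, Status.AVAILABLE -> NextStep.INFER
}

fun main() {
    for (status in Status.values()) {
        println("$status -> ${nextStepFor(status)}")
    }
}
```

Because `when` over an enum is exhaustive, the compiler flags any status that is left unhandled, which an `if`/`else if` chain does not do.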

For more examples of implementing the GenAI APIs, check out the official documentation and samples on GitHub.

Use cases

Here is some guidance on how to best use the current GenAI APIs:

For Summarization, consider:

    • Conversation messages or transcripts that involve 2 or more users
    • Articles or documents less than 4000 tokens (or about 3000 English words). Using the first few paragraphs for summarization is usually good enough to capture the most important information.
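
The summarization guidance above can be sketched as a small pre-processing step that keeps only the leading paragraphs of an article that fit the token budget. This is a rough heuristic, not a real tokenizer: it assumes the 4000-tokens-to-3000-words ratio quoted above holds for your text, and all names here are hypothetical helpers rather than ML Kit APIs.

```kotlin
import kotlin.math.ceil

// Rule of thumb from the guidance above: 4000 tokens is about 3000 English words.
const val WORDS_PER_TOKEN = 3000.0 / 4000.0
const val SUMMARIZATION_TOKEN_BUDGET = 4000

// Rough token estimate from a whitespace word count; not a real tokenizer.
fun estimateTokens(text: String): Int {
    val words = text.trim().split(Regex("\\s+")).count { it.isNotEmpty() }
    return ceil(words / WORDS_PER_TOKEN).toInt()
}

// Keeps leading paragraphs until adding the next one would exceed the budget.
// Returns an empty string if even the first paragraph is over budget.
fun leadingParagraphsWithinBudget(
    article: String,
    budget: Int = SUMMARIZATION_TOKEN_BUDGET,
): String {
    val kept = mutableListOf<String>()
    var used = 0
    for (paragraph in article.split(Regex("\\n\\s*\\n"))) {
        val cost = estimateTokens(paragraph)
        if (used + cost > budget) break
        kept += paragraph
        used += cost
    }
    return kept.joinToString("\n\n")
}

fun main() {
    val article = "First paragraph here.\n\nSecond paragraph here.\n\nThird paragraph here."
    // A deliberately tiny budget to show truncation on a short input.
    println(leadingParagraphsWithinBudget(article, budget = 8))
}
```

The trimmed text would then be passed as the input when building the summarization request. The same estimate could be used to check short content against the 256-token guidance for Proofreading and Rewriting.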

For Proofreading and Rewriting APIs, consider using them during the content creation process for short content below 256 tokens to help with tasks such as:

    • Refining messages in a particular tone, such as more formal or more casual
    • Polishing personal notes for easier consumption later

For the Image Description API, consider it for:

    • Generating titles of images
    • Generating metadata for image search
    • Using descriptions of images in use cases where the images themselves cannot be displayed, such as within a list of chat messages
    • Generating alternative text to help visually impaired users better understand content as a whole

GenAI API in production

Envision is an app that verbalizes the visual world to help people who are blind or have low vision lead more independent lives. A common use case in the app is for users to take a picture to have a document read out loud. Using the GenAI Summarization API, Envision is now able to get a concise summary of a captured document. This significantly enhances the user experience by allowing them to quickly grasp the main points of documents and determine if a more detailed reading is desired, saving them time and effort.

[Image: Side-by-side images of a mobile device showing a document on a table on the left, and the results of the scanned document on the right showing details providing the what, when, and where as written in the document]

Supported devices

GenAI APIs are available on Android devices using optimized MediaTek Dimensity, Qualcomm Snapdragon, and Google Tensor platforms through AICore. For a comprehensive list of devices that support GenAI APIs, refer to our official documentation.

Learn more

Start implementing GenAI APIs in your Android apps today with guidance from our official documentation and samples on GitHub: AI Catalog GenAI API Samples with Compose, ML Kit GenAI APIs Quickstart.
