Implementing a floating-window speech recognition feature on Android

Keywords: Front-end Android

I am about as ordinary as a programmer gets. Out of curiosity, I took some time to study and build a floating-window speech recognition feature for a phone, one that does not get in the way of whatever else you are doing. The speech recognition uses the Baidu Cloud speech SDK, so the implementation should not be hard. Speech recognition itself is the core technology here, and it relies on big data. Right, I should study big data when I have time.

To get to the point, start with the page. The layout, activity_main.xml, is shown below. A button opens or closes the floating window, and the input box takes a password, similar to an unlock function.

<?xml version="1.0" encoding="utf-8"?>
<androidx.constraintlayout.widget.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <LinearLayout
        android:id="@+id/linearLayout"
        android:layout_width="match_parent"
        android:layout_height="50dp"
        android:orientation="horizontal"
        app:layout_constraintBottom_toBottomOf="parent">

        <Button
            android:id="@+id/button"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:layout_weight="1"
            android:text="Test" />

        <Button
            android:id="@+id/button1"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:layout_weight="1"
            android:text="Test Voice" />
    </LinearLayout>

    <TextView
        android:id="@+id/showState"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:gravity="center"
        android:text="TextView" />

    <ScrollView
        android:id="@+id/scrollView"
        android:layout_width="0dp"
        android:layout_height="0dp"
        app:layout_constraintBottom_toTopOf="@+id/linearLayout"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/showState">

        <LinearLayout
            android:layout_width="match_parent"
            android:layout_height="wrap_content"
            android:orientation="vertical">

            <TextView
                android:id="@+id/showMsg"
                android:layout_width="match_parent"
                android:layout_height="wrap_content"
                android:text="Hello World!" />

            <EditText
                android:id="@+id/editText"
                android:layout_width="match_parent"
                android:layout_height="wrap_content"
                android:ems="10"
                android:inputType="textPassword" />
        </LinearLayout>
    </ScrollView>

</androidx.constraintlayout.widget.ConstraintLayout>

Next comes the interaction logic of the page, in the MainActivity class. The code is pasted below.

package com.example.voiceapplication;

import androidx.appcompat.app.AppCompatActivity;
import android.content.Intent;
import android.os.Build;
import android.os.Bundle;
import android.os.Handler;
import android.provider.Settings;
import android.view.View;
import android.widget.Button;
import android.widget.EditText;
import android.widget.Toast;

public class MainActivity extends AppCompatActivity {

    private Handler mHandler = null;
    private EditText editText;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        setContentView(R.layout.activity_main);

        mHandler = new Handler();

        this.editText = (EditText) findViewById(R.id.editText);

        Button button = (Button) findViewById(R.id.button);

        button.setOnClickListener(new View.OnClickListener() {

            @Override
            public void onClick(View v) {

                mHandler.postDelayed(new Runnable() {

                    @Override
                    public void run() {

//                        create2();
                    }
                }, 1000 * 3);

            }
        });

        Button button1 = (Button) findViewById(R.id.button1);
        button1.setOnClickListener(new View.OnClickListener(){
            @Override
            public void onClick(View v) {
                String pwd = editText.getText().toString();
                if (!pwd.equals("wx zs1026")) {
                    Toast.makeText(MainActivity.this, "Wrong password!", Toast.LENGTH_SHORT).show();
                    return;
                }

                // Note: this approach works on the emulator but crashes on a real device
                //create3();

                // This approach works much better: it asks for the overlay permission first
                create4();
            }
        });
    }

    @Override
    protected void onDestroy() {
        super.onDestroy();
        WindowUtils.hidePopupWindow();
    }


    /**
     * Reference: "Explain Android global pop-up dialog SYSTEM_ALERT_WINDOW permission"
     * Published: July 7, 2019 22:08:19
     * http://www.zyiz.net/tech/detail-63251.html
     * https://www.cnblogs.com/mengdd/p/3824782.html
     * */
    void create3() {
        WindowUtils.showPopupWindow(MainActivity.this);
    }
    
    /**
     * Reference: "Android 8.0: global dialog and floating-window pop-up, fully adapted"
     *  https://www.jianshu.com/p/78953f3c07d5
     * */
    void create4() {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
            if (Settings.canDrawOverlays(MainActivity.this)) {
                // TODO: the referenced article does not show how MainService is implemented; presumably it creates the global dialog from a Service
//                Intent intent = new Intent(MainActivity.this, MainService.class);
//                startService(intent);
                create3();
//                finish();
            } else {
                // Without the permission, send the user to the settings screen to grant it
                Intent intent = new Intent(Settings.ACTION_MANAGE_OVERLAY_PERMISSION);
                Toast.makeText(MainActivity.this,"Permission is required to use the floating window",Toast.LENGTH_SHORT).show();
                startActivity(intent);
            }

        }else {
            // Below API 23 no runtime check is needed, so show the window directly
//            Intent intent = new Intent(MainActivity.this, MainService.class);
//            startService(intent);
            create3();
//            finish();
        }

    }
}
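The branching in create4() can be isolated from the Android classes and checked on a plain JVM. The sketch below is only an illustration under my own naming: OverlayPermissionSketch, Action and decide are hypothetical, and the two inputs stand in for Build.VERSION.SDK_INT and Settings.canDrawOverlays(...). Note also that the overlay can only ever be granted if the app declares android.permission.SYSTEM_ALERT_WINDOW in its manifest, which this article does not show.

```java
/** Sketch of the decision made in create4(); names are hypothetical, not from the article. */
final class OverlayPermissionSketch {

    /** What the activity should do when the user asks for the floating window. */
    enum Action { SHOW_WINDOW, REQUEST_PERMISSION }

    /** API level 23 (Android 6.0), the value of Build.VERSION_CODES.M. */
    static final int API_M = 23;

    /**
     * @param sdkInt          stands in for Build.VERSION.SDK_INT
     * @param canDrawOverlays stands in for Settings.canDrawOverlays(context)
     */
    static Action decide(int sdkInt, boolean canDrawOverlays) {
        if (sdkInt < API_M) {
            // Older systems have no runtime check for the overlay permission.
            return Action.SHOW_WINDOW;
        }
        return canDrawOverlays ? Action.SHOW_WINDOW : Action.REQUEST_PERMISSION;
    }

    public static void main(String[] args) {
        System.out.println(decide(22, false)); // SHOW_WINDOW
        System.out.println(decide(26, false)); // REQUEST_PERMISSION
        System.out.println(decide(26, true));  // SHOW_WINDOW
    }
}
```

In the activity, REQUEST_PERMISSION corresponds to starting Settings.ACTION_MANAGE_OVERLAY_PERMISSION, and SHOW_WINDOW to calling create3().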

Then create the layout for the floating window, popupwindow.xml, shown below. It has only a title and a message area. Keeping it simple is a good thing: the floating window cannot be too large, or it will block whatever you do next.

<?xml version="1.0" encoding="utf-8"?>
<FrameLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="300dp"
    android:layout_gravity="center"
    android:alpha="0.7"
    android:background="#0E0E0E"
    android:clickable="false"
    android:gravity="center"
    android:orientation="vertical">

    <RelativeLayout
        android:id="@+id/popup_window"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:orientation="vertical">

        <LinearLayout
            android:id="@+id/header"
            android:layout_width="match_parent"
            android:layout_height="wrap_content"
            android:orientation="vertical">

            <LinearLayout
                android:id="@+id/footer"
                android:layout_width="match_parent"
                android:layout_height="wrap_content"
                android:orientation="horizontal"
                android:paddingLeft="@dimen/dialog_content_padding_side"
                android:paddingRight="@dimen/dialog_content_padding_side"
                android:paddingBottom="@dimen/dialog_content_padding_bottom">


                <TextView
                    android:id="@+id/title"
                    android:layout_width="match_parent"
                    android:layout_height="wrap_content"
                    android:layout_weight="1"
                    android:text="@string/default_title"
                    android:textColor="@color/dialog_title_text_color"
                    android:textSize="@dimen/dialog_title_text_size" />

            </LinearLayout>

            <View
                android:id="@+id/title_divider"
                android:layout_width="match_parent"
                android:layout_height="2dp"
                android:background="@color/black" />
        </LinearLayout>

        <ScrollView
            android:id="@+id/miniScrolView"
            android:layout_width="match_parent"
            android:layout_height="match_parent"
            android:layout_below="@+id/header">

            <LinearLayout
                android:layout_width="match_parent"
                android:layout_height="wrap_content"
                android:orientation="vertical">

                <TextView
                    android:id="@+id/content"
                    android:layout_width="match_parent"
                    android:layout_height="wrap_content"
                    android:padding="@dimen/dialog_content_padding_side"
                    android:text="@string/default_content"
                    android:textColor="@color/dialog_content_text_color"
                    android:textSize="@dimen/dialog_content_text_size" />
            </LinearLayout>
        </ScrollView>

    </RelativeLayout>

</FrameLayout>

Then implement the logic of the floating window, in the WindowUtils class. This part is a little more complex, so the code is pasted below. As soon as the floating window appears, speech recognition starts automatically, and as soon as it is closed, speech recognition stops. What is needed here is the long speech recognition feature.

package com.example.voiceapplication;

import android.content.Context;
import android.graphics.PixelFormat;
import android.graphics.Point;
import android.graphics.Rect;
import android.os.Build;
import android.view.Gravity;
import android.view.LayoutInflater;
import android.view.MotionEvent;
import android.view.View;
import android.view.ViewTreeObserver;
import android.view.WindowManager;
import android.widget.ScrollView;
import android.widget.TextView;

public class WindowUtils {
    private static final String LOG_TAG = "WindowUtils";
    private static View mView = null;
    private static WindowManager mWindowManager = null;
    private static Context mContext = null;
    public static Boolean isShown = false;
    private static VoiceMode voice = null;

    /**
     * Show the floating window (or hide it if it is already shown)
     *
     * @param context
     */
    public static void showPopupWindow(final Context context) {
        if (isShown) {
            LogUtil.i(LOG_TAG, "return cause already shown");
            hidePopupWindow();
            return;
        }

        isShown = true;
        LogUtil.i(LOG_TAG, "showPopupWindow");

        // Get the Context of the application
        mContext = context.getApplicationContext();

        voice = new VoiceMode(mContext, new VoiceMode.Listener() {
            @Override
            public void state(String text) {
                updateTitle(text);
            }

            @Override
            public void append(String text) {

            }

            @Override
            public void update(String text) {
                updateContent(text,true);
            }


        });

        // Get WindowManager
        mWindowManager = (WindowManager) mContext.getSystemService(Context.WINDOW_SERVICE);

        mView = setUpView(context);

        final WindowManager.LayoutParams params = new WindowManager.LayoutParams();

        // Window type
//        params.type = WindowManager.LayoutParams.TYPE_SYSTEM_ALERT;
        // TYPE_SYSTEM_ALERT on its own gives a global dialog, but it does not work on every
        // Android phone (for example, one user's Huawei P10 could not show the window).

        // Choosing the type by API level is the only way I got the window to show
        if (Build.VERSION.SDK_INT>=26) {// TYPE_APPLICATION_OVERLAY was added in Android 8.0
            params.type = WindowManager.LayoutParams.TYPE_APPLICATION_OVERLAY;
        }else{
            params.type = WindowManager.LayoutParams.TYPE_SYSTEM_ALERT;
        }

        // Set flag
//        int flags = WindowManager.LayoutParams.FLAG_ALT_FOCUSABLE_IM;
//        int flags = WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE;
        int flags = WindowManager.LayoutParams.FLAG_NOT_TOUCH_MODAL;

        // FLAG_NOT_TOUCH_MODAL does not block events from reaching the windows behind this one.
        // With FLAG_NOT_FOCUSABLE set and a small floating window, the app icons behind it
        // change from not long-pressable to long-pressable.
        // Without this flag, swiping between home-screen pages has problems.
        // If WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE is set, the pop-up View does not receive Back key events.
        params.flags = flags;
        // Without this, the transparent mask of the window is drawn in black
        params.format = PixelFormat.TRANSLUCENT;

        Point p = new Point();
        //Get window manager
        WindowManager wm = (WindowManager) context.getSystemService(Context.WINDOW_SERVICE);
        wm.getDefaultDisplay().getSize(p);
        int screenWidth = p.x; // Screen width
        int screenHeight = p.y;
        params.width = (int) Math.round(screenWidth*0.5);
        params.height = (int) Math.round(screenHeight*0.4);

//        params.width = WindowManager.LayoutParams.MATCH_PARENT;
//        params.height = WindowManager.LayoutParams.MATCH_PARENT;

//        params.x = 10;
//        params.y = 10;

        params.gravity = Gravity.TOP;

        mWindowManager.addView(mView, params);

        LogUtil.i(LOG_TAG, "add view");
        voice.start();

    }

    /**
     * Hide the floating window and stop speech recognition
     */
    public static void hidePopupWindow() {
        LogUtil.i(LOG_TAG, "hide " + isShown + ", " + mView);
        if (isShown && null != mView) {
            LogUtil.i(LOG_TAG, "hidePopupWindow");
            mWindowManager.removeView(mView);
            isShown = false;
            voice.destory();
        }

    }

    private static TextView showContent=null;
    private static TextView showTitle=null;

    public static void updateTitle(String title) {
        if (showTitle==null) return;

        showTitle.setText(title);
    }

    public static  void updateContent(String content, Boolean isReset) {
        if (showContent==null) return;

        if (isReset==true) showContent.setText("");

        showContent.append(content);
    }

    private static View setUpView(final Context context) {

        LogUtil.i(LOG_TAG, "setUp view");

        View view = LayoutInflater.from(context).inflate(R.layout.popupwindow, null);

        showContent = (TextView) view.findViewById(R.id.content);
        showTitle = (TextView) view.findViewById(R.id.title);

        final ScrollView scrollView = (ScrollView) view.findViewById(R.id.miniScrolView);

        // Make the ScrollView scroll to the bottom automatically.
        // Adapted from https://blog.csdn.net/weixin_39753616/article/details/117314708?utm_medium=distribute.pc_relevant.none-task-blog-2~default~baidujs_title~default-0.no_search_link&spm=1001.2101.3001.4242.1
        scrollView.getViewTreeObserver().addOnGlobalLayoutListener(new ViewTreeObserver.OnGlobalLayoutListener() {
            @Override
            public void onGlobalLayout() {
                scrollView.post(new Runnable() {
                    @Override
                    public void run() {
                        scrollView.fullScroll(View.FOCUS_DOWN);
                    }
                });
            }
        });


        // Dismiss the window when the area outside it is touched (currently disabled).
        // This works by making the floating window full-screen with a transparent outer layer
        // and treating the middle part as the content area, so a touch outside the content
        // area counts as a touch outside the floating window.
//        final View popupWindowView = view.findViewById(R.id.popup_window);//  Non transparent content area

        view.setOnTouchListener(new View.OnTouchListener() {

            @Override
            public boolean onTouch(View v, MotionEvent event) {

                LogUtil.i(LOG_TAG, "onTouch");
                int x = (int) event.getX();
                int y = (int) event.getY();
                Rect rect = new Rect();
//                popupWindowView.getGlobalVisibleRect(rect);
//                if (!rect.contains(x, y)) {
//                    WindowUtils.hidePopupWindow();
//                }

                LogUtil.i(LOG_TAG, "onTouch : " + x + ", " + y + ", rect: " + rect);
//                ((Activity) mContext).dispatchTouchEvent(event);
                return false;
            }
        });

        return view;

    }
}
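Two small decisions in showPopupWindow() are easy to check away from a device: which window type to request, and how large the window should be. The following is a minimal sketch that runs on a plain JVM. OverlayLayoutSketch, WindowType, windowTypeFor and overlaySize are hypothetical names of mine; the enum constants merely mirror the names of the WindowManager.LayoutParams constants used above, and the 50% / 40% ratios are the ones from this article.

```java
/** Sketch of the type and size choices in showPopupWindow(); names are hypothetical. */
final class OverlayLayoutSketch {

    /** Mirrors the names of the two WindowManager.LayoutParams constants used in the article. */
    enum WindowType { TYPE_APPLICATION_OVERLAY, TYPE_SYSTEM_ALERT }

    /** From API 26 (Android 8.0) on, apps use TYPE_APPLICATION_OVERLAY for floating windows. */
    static WindowType windowTypeFor(int sdkInt) {
        return sdkInt >= 26 ? WindowType.TYPE_APPLICATION_OVERLAY : WindowType.TYPE_SYSTEM_ALERT;
    }

    /** Returns {width, height}: half the screen width and 40% of the screen height. */
    static int[] overlaySize(int screenWidth, int screenHeight) {
        int width = (int) Math.round(screenWidth * 0.5);
        int height = (int) Math.round(screenHeight * 0.4);
        return new int[] { width, height };
    }

    public static void main(String[] args) {
        int[] size = overlaySize(1080, 1920);
        System.out.println(windowTypeFor(30) + " " + size[0] + "x" + size[1]);
        // TYPE_APPLICATION_OVERLAY 540x768
    }
}
```

Keeping the window well below full-screen size matters here because the window is not touch-modal: everything it does not cover stays usable.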

The remaining piece is the call to the speech recognition SDK. The code above uses a class named VoiceMode; its code is pasted below, and the actual SDK calls are inside it.

package com.example.voiceapplication;

import android.content.Context;
import android.widget.Toast;
import com.baidu.speech.EventListener;
import com.baidu.speech.EventManager;
import com.baidu.speech.EventManagerFactory;
import com.baidu.speech.asr.SpeechConstant;
import org.json.JSONException;
import org.json.JSONObject;
import java.util.HashMap;
import java.util.Map;

public class VoiceMode {

    private Boolean isWork = false;
    private EventManager asr = null;
    private EventListener listener;
    private Context context;
    private Listener callback;
    int c = 1;
    private String cache  = "";
    private String word = "";

    public interface Listener {
        /**
         * System prompt:
         * */
        void state(String text);
        void append(String text);
        /**
         * Identification results
         * */
        void update(String text);
    }

    public VoiceMode(Context context, Listener cb) {
        this.context = context;
        this.callback = cb;

        this.asr = EventManagerFactory.create(context, "asr");

        this.listener = new EventListener()
        {
            @Override
            public void onEvent(String name, String params, byte[] bytes, int i, int i1) {
                String result=null;

                switch (name){
                    case SpeechConstant.CALLBACK_EVENT_ASR_READY:
                    {
                        // The engine is ready to speak. Generally, after receiving this event, the user is notified through the UI that he can speak
                        result = "You can talk";
                        callback.state(result);
                    }
                    break;
                    case SpeechConstant.CALLBACK_EVENT_ASR_PARTIAL:
                    {
                        // Temporary result, final result and semantic result of a sentence
                        result = "One sentence final result";// + params;
                        callback.state(result);
                        try {
                            JSONObject obj = new JSONObject(params);

                            int errCode = obj.getInt("error");

                            if (errCode>0) {
                                String desc = obj.getString("desc");
                                result += "errorCode:" + errCode+ ", desc:" + desc;

                                int subError = obj.getInt("sub_error");
                                if (subError>0) {
                                    result += ", subError:"+subError;
                                }
                                word = result;
                            } else {
                                result = obj.getString("best_result");
                                word = result;
                            }

                        } catch (JSONException e) {
                            e.printStackTrace();
                            word = ">>>There's a mistake!";
//                        return;
                        }

                        callback.update(cache+word);
                    }
                    break;
                    case SpeechConstant.CALLBACK_EVENT_ASR_FINISH:
                    {
                        result = "End of sentence recognition";// "\n" + params;
                        callback.state(result);

                        try {
                            JSONObject obj = new JSONObject(params);

                            int errCode = obj.getInt("error");

                            if (errCode>0) {
                                String desc = obj.getString("desc");

                                result += "errorCode:" + errCode+ ", desc:" + desc;

                                int subError = obj.getInt("sub_error");

                                if (subError>0) {
                                    result += ", subError:"+subError;
                                }
                                word = result;
                            }
                            else {
                                // Compare by content: == on strings tests object identity
                                if (cache.isEmpty()) cache += word;
                                else cache = cache +"\n"+word;
                                callback.update(cache);
                            }

                        } catch (JSONException e) {
                            e.printStackTrace();
                            result += "\n There's a mistake!";
                            word = result;
                            callback.update(cache+word);
                        }

                    }
                    break;
                    case SpeechConstant.CALLBACK_EVENT_ASR_EXIT:
                    {
                        result = "Identification is completed and resources are released";// \n" + params;
                        isWork = false;
                        callback.state(result);
                    }
                    break;
                    case SpeechConstant.CALLBACK_EVENT_ASR_ERROR:
                    {
                        result = "An error occurred\n" + params;
                        callback.state(result);
                    }
                    break;
                    default:
                    {
                        // ... supported output events and event supported event parameters are shown in the section "input and output parameters"
                        result = "Input and output parameters\n"+params;
                        callback.state(result);
                    }

                    if (params!=null) {
//                        LogUtil.i("Result", params);
//{"results_recognition": ["Hello,"], "result_type":"final_result","best_result": "Hello,", "origin_result": {"corpus_no": 7037979383025219, "err_no": 0, "result": {"word": ["Hello,"]}, "sn":"78ae2481-0a5c-4c18-a8e4-cf42590fe8d5_s-0"},"error":0}
                    }

                }


            }
        };

        asr.registerListener(listener);
    }


    public void start() {
        // Nothing to do if the recognizer has already been released
        if (this.asr == null || this.listener == null) {
            return;
        }


        if (isWork) {
            asr.send(SpeechConstant.ASR_STOP, null, null, 0, 0);
            //Send the stop recording event, end the recording in advance, and wait for the recognition result
            isWork = false;

            Toast.makeText(context, "Stop Work.", Toast.LENGTH_SHORT).show();
        } else {

            // Collect the parameters in a Map, then serialize them to a JSON string
            Map<String, Object> map = new HashMap<String, Object>();
            map.put("accept-audio-data",false);
            map.put("disable-punctuation",false);
            // pid 1837: Sichuan dialect
            map.put("pid",1837);
            // Enable long speech recognition. In this mode the VAD parameter must not be set to touch.
            // Long speech can recognize hours of audio. Note the choice of input-method model.
            //
            // Long speech is enabled when BDS_ASR_ENABLE_LONG_SPEECH = true or VAD_ENDPOINT_TIMEOUT = 0:
            //{"enable.long.speech":true,"accept-audio-volume":false}
            // or
            //{"accept-audio-volume":false,"vad.endpoint-timeout":0}
            map.put("accept-audio-volume",false);
            map.put("vad.endpoint-timeout",0);
//                map.put();

            // Map.toString() yields {key=value, ...}, which is not JSON, so serialize with JSONObject
            String json = new JSONObject(map).toString();

            asr.send(SpeechConstant.ASR_START, json, null, 0, 0);
            isWork = true;

            Toast.makeText(context, "Working...", Toast.LENGTH_SHORT).show();
        }
    }

    public void destory() {

        if (this.asr!=null && this.listener!=null) {
            if (this.isWork) {
                asr.send(SpeechConstant.ASR_CANCEL, null, null, 0, 0); // Cancel recognition
            }
            asr.unregisterListener(this.listener); //Release resources
            this.listener = null;
            this.asr = null;
        }
    }
}
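The listener above keeps two pieces of text: the sentences already finished (cache) and the sentence still being recognized (word). That bookkeeping can be tried out without the SDK. The sketch below is my own simplified version, not the author's code: TranscriptBufferSketch, onPartial and onFinish are hypothetical names, it always joins sentences with a newline, and unlike the listing it clears the current sentence once it has been committed.

```java
/** Sketch of the cache/word bookkeeping in the listener; names and details are hypothetical. */
final class TranscriptBufferSketch {
    private final StringBuilder committed = new StringBuilder(); // finished sentences
    private String current = "";                                 // sentence in progress

    /** A partial result replaces the sentence in progress. Returns the text to display. */
    String onPartial(String bestResult) {
        current = bestResult;
        return render();
    }

    /** End of a sentence: commit the sentence in progress. Returns the text to display. */
    String onFinish() {
        if (!current.isEmpty()) {
            if (committed.length() > 0) {
                committed.append('\n');
            }
            committed.append(current);
            current = "";
        }
        return render();
    }

    private String render() {
        if (committed.length() == 0) return current;
        if (current.isEmpty()) return committed.toString();
        return committed + "\n" + current;
    }

    public static void main(String[] args) {
        TranscriptBufferSketch buffer = new TranscriptBufferSketch();
        buffer.onPartial("Hello");
        buffer.onPartial("Hello, world");
        buffer.onFinish();
        System.out.println(buffer.onPartial("Next sentence"));
    }
}
```

In terms of the listing, onPartial would be fed best_result from CALLBACK_EVENT_ASR_PARTIAL and onFinish would be called on CALLBACK_EVENT_ASR_FINISH, with each return value passed to callback.update(...).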

The key code is the line below, which calls the speech recognition SDK. The SDK itself is not provided here; if you need it, download the Android SDK from Baidu Cloud speech recognition. The documentation is available there, and you can try the service for free.

this.asr = EventManagerFactory.create(context, "asr");
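Creating the event manager is only the first step: recognition starts when ASR_START is sent together with a JSON string of parameters. The sketch below runs on a plain JVM and shows what that string should look like for the parameters used in this article, and why a plain Map.toString() is not a substitute. AsrParamsSketch and its hand-written toJson helper are my own illustration and not part of the Baidu SDK; the helper only handles flat maps of strings, numbers and booleans. On Android, org.json's new JSONObject(map).toString() does the same job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: serializing the ASR start parameters as JSON; names are hypothetical. */
final class AsrParamsSketch {

    /** Serializes a flat map of String, Number and Boolean values as a JSON object. */
    static String toJson(Map<String, Object> params) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : params.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v instanceof Number || v instanceof Boolean) {
                sb.append(v);                                   // bare literal
            } else {
                sb.append('"').append(escape(String.valueOf(v))).append('"');
            }
        }
        return sb.append('}').toString();
    }

    /** Escapes backslashes and double quotes only; enough for the keys used here. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("pid", 1837);
        params.put("accept-audio-volume", false);
        params.put("vad.endpoint-timeout", 0);
        System.out.println(params);         // {pid=1837, accept-audio-volume=false, vad.endpoint-timeout=0}
        System.out.println(toJson(params)); // {"pid":1837,"accept-audio-volume":false,"vad.endpoint-timeout":0}
    }
}
```

The first printed line is what Map.toString() produces: keys are unquoted and joined with =, so a JSON parser will reject it.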

That covers everything in detail. If all of the above is done and the project compiles and runs without problems, the result looks like the figures below. OK, job done, time to celebrate~

Figure 1: effect on opening. Figure 2: running continuously.

Oh, by the way, dragging the floating window is not implemented. Let me be lazy and skip it here~ If this was helpful, a like or a word of encouragement is appreciated~
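For anyone who wants to fill that gap, the arithmetic behind the usual drag pattern is small enough to sketch on a plain JVM. This is not part of the project above and has not been tried against this particular window: DragTrackerSketch, onDown and onMove are hypothetical names, and wiring it in would also require making the LayoutParams reachable from the touch listener.

```java
/** Sketch of drag arithmetic for a floating window; names are hypothetical and untested on a device. */
final class DragTrackerSketch {
    private int startX, startY;        // window offset when the finger went down
    private float downRawX, downRawY;  // screen position of the finger at that moment

    /** Call on ACTION_DOWN with the current window offset and the raw touch position. */
    void onDown(int windowX, int windowY, float rawX, float rawY) {
        startX = windowX;
        startY = windowY;
        downRawX = rawX;
        downRawY = rawY;
    }

    /** Call on ACTION_MOVE. Returns the new window offset as {x, y}. */
    int[] onMove(float rawX, float rawY) {
        int x = startX + Math.round(rawX - downRawX);
        int y = startY + Math.round(rawY - downRawY);
        return new int[] { x, y };
    }

    public static void main(String[] args) {
        DragTrackerSketch tracker = new DragTrackerSketch();
        tracker.onDown(10, 20, 100f, 200f);
        int[] pos = tracker.onMove(130f, 250f);
        System.out.println(pos[0] + ", " + pos[1]); // 40, 70
    }
}
```

In an OnTouchListener, the inputs would come from params.x, params.y, event.getRawX() and event.getRawY(); on each move the result would be written back to params.x and params.y, followed by mWindowManager.updateViewLayout(mView, params).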

Finally, I would like to thank the following authors for their solutions:

  1. Android global dialog, compatible with Android 8.0
  2. Explain Android global pop-up dialog SYSTEM_ALERT_WINDOW permission
  3. Android floating window implementation using WindowManager

Posted by cute_girl on Sun, 05 Dec 2021 10:23:53 -0800