Understanding ANR in One Article

Posted by 怎甘沉沦 on 2020-01-24 20:40:19

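1. Definition of ANR

ANR (Application Not Responding): the application is not responding.
In other words, an ANR is raised when the main thread fails to finish a specific piece of work within a specific time limit.
Android has the following types of ANR: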

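  1. KeyDispatchTimeout: an input event is not handled within 5 seconds;
  2. ServiceTimeout: a foreground service does not finish within 20 seconds, or a background service within 200 seconds;
  3. BroadcastTimeout: BroadcastReceiver.onReceive does not finish within 10 seconds for a foreground broadcast, or within 60 seconds for a background broadcast;
  4. ProcessContentproviderPublishTimeoutLocked: a ContentProvider is not published within 10 seconds.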
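
The limits listed above can be collected in one place as a quick reference. The sketch below is only an illustration: AnrKind and its methods are hypothetical names, not AOSP types, and for the two kinds that have no foreground/background distinction in the list the same value is simply used twice.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical summary of the timeouts listed above; this is not an AOSP type. */
enum AnrKind {
    KEY_DISPATCH(5, 5),          // no foreground/background split in the list above
    SERVICE(20, 200),
    BROADCAST(10, 60),
    PROVIDER_PUBLISH(10, 10);    // no foreground/background split in the list above

    private final long foregroundSeconds;
    private final long backgroundSeconds;

    AnrKind(long foregroundSeconds, long backgroundSeconds) {
        this.foregroundSeconds = foregroundSeconds;
        this.backgroundSeconds = backgroundSeconds;
    }

    /** Returns the limit in milliseconds for the foreground or background case. */
    long timeoutMillis(boolean foreground) {
        return TimeUnit.SECONDS.toMillis(foreground ? foregroundSeconds : backgroundSeconds);
    }
}

class AnrKindDemo {
    public static void main(String[] args) {
        for (AnrKind kind : AnrKind.values()) {
            System.out.println(kind + " fg=" + kind.timeoutMillis(true)
                    + "ms bg=" + kind.timeoutMillis(false) + "ms");
        }
    }
}
```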

2. Causes of ANR in each scenario

Let's go through these scenarios one by one and see how the system raises an ANR.

2.1 ServiceTimeout

First, let's look at how a Service is started.
[Figure: Service start flow (image not included)]
Here is the source of ActiveServices.realStartServiceLocked:

    private final void realStartServiceLocked(ServiceRecord r,
            ProcessRecord app, boolean execInFg) throws RemoteException {
        if (app.thread == null) {
            throw new RemoteException();
        }
        if (DEBUG_MU)
            Slog.v(TAG_MU, "realStartServiceLocked, ServiceRecord.uid = " + r.appInfo.uid
                    + ", ProcessRecord.uid = " + app.uid);
        r.app = app;
        r.restartTime = r.lastActivity = SystemClock.uptimeMillis();

        final boolean newService = app.services.add(r);
        // 1. Start ANR monitoring
        bumpServiceExecutingLocked(r, execInFg, "create");
        mAm.updateLruProcessLocked(app, false, null);
        updateServiceForegroundLocked(r.app, /* oomAdj= */ false);
        mAm.updateOomAdjLocked();

        boolean created = false;
        try {
            if (LOG_SERVICE_START_STOP) {
                String nameTerm;
                int lastPeriod = r.shortName.lastIndexOf('.');
                nameTerm = lastPeriod >= 0 ? r.shortName.substring(lastPeriod) : r.shortName;
                EventLogTags.writeAmCreateService(
                        r.userId, System.identityHashCode(r), nameTerm, r.app.uid, r.app.pid);
            }
            synchronized (r.stats.getBatteryStats()) {
                r.stats.startLaunchedLocked();
            }
            mAm.notifyPackageUse(r.serviceInfo.packageName,
                                 PackageManager.NOTIFY_PACKAGE_USE_SERVICE);
            app.forceProcessStateUpTo(ActivityManager.PROCESS_STATE_SERVICE);
            // 2. Start the service
            app.thread.scheduleCreateService(r, r.serviceInfo,
                    mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
                    app.repProcState);
            r.postNotification();
            created = true;
        } catch (DeadObjectException e) {
            Slog.w(TAG, "Application dead when creating service " + r);
            mAm.appDiedLocked(app);
            throw e;
        } finally {
            if (!created) {
                // Keep the executeNesting count accurate.
                final boolean inDestroying = mDestroyingServices.contains(r);
                serviceDoneExecutingLocked(r, inDestroying, inDestroying);

                // Cleanup.
                if (newService) {
                    app.services.remove(r);
                    r.app = null;
                }

                // Retry.
                if (!inDestroying) {
                    scheduleServiceRestartLocked(r, false);
                }
            }
        }
        ......
    }

At comment 2 in the code above the service is started. Before that, bumpServiceExecutingLocked is called; it in turn calls scheduleServiceTimeoutLocked, which starts the ANR monitoring, as shown below.

// Timeout for a foreground service
static final int SERVICE_TIMEOUT = 20*1000;

// Timeout for a background service
static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;
    
void scheduleServiceTimeoutLocked(ProcessRecord proc) {
    if (proc.executingServices.size() == 0 || proc.thread == null) {
        return;
    }
    Message msg = mAm.mHandler.obtainMessage(
            ActivityManagerService.SERVICE_TIMEOUT_MSG);
    msg.obj = proc;
    mAm.mHandler.sendMessageDelayed(msg,
            proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
}

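It is easy to see that scheduleServiceTimeoutLocked simply posts a delayed message to the handler: the delay is 200 seconds for a background service and 20 seconds for a foreground service.
If the service finishes within the limit, ActivityManagerService.serviceDoneExecuting is called, and the message is removed through mServices.serviceDoneExecutingLocked.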

public void serviceDoneExecuting(IBinder token, int type, int startId, int res) {
    synchronized(this) {
        if (!(token instanceof ServiceRecord)) {
            Slog.e(TAG, "serviceDoneExecuting: Invalid service token=" + token);
            throw new IllegalArgumentException("Invalid service token");
        }
        mServices.serviceDoneExecutingLocked((ServiceRecord)token, type, startId, res);
    }
}
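
The "post a delayed message, remove it when the work is done" pattern described above can be sketched outside of Android with a ScheduledExecutorService. This is a minimal illustration, not the AOSP implementation: TimeoutWatchdog and all of its methods are hypothetical names, and the timer callback only records that the deadline passed instead of reporting an ANR.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the delayed-message timeout pattern. */
class TimeoutWatchdog {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final CountDownLatch fired = new CountDownLatch(1);
    private ScheduledFuture<?> pending;

    /** Comparable to scheduleServiceTimeoutLocked: arm the timer before the work starts. */
    synchronized void arm(long timeoutMillis) {
        pending = scheduler.schedule(() -> fired.countDown(),
                timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Comparable to removing SERVICE_TIMEOUT_MSG once the work has finished. */
    synchronized void disarm() {
        if (pending != null) {
            pending.cancel(false);
            pending = null;
        }
    }

    /** Waits up to waitMillis and reports whether the timeout callback ran. */
    boolean awaitTimedOut(long waitMillis) {
        try {
            return fired.await(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}

class TimeoutWatchdogDemo {
    public static void main(String[] args) {
        // Work that finishes in time: the timer is cancelled before it fires.
        TimeoutWatchdog fast = new TimeoutWatchdog();
        fast.arm(5_000);
        fast.disarm();
        System.out.println("fast timed out: " + fast.awaitTimedOut(100));
        fast.shutdown();

        // Work that never reports completion: the timer fires.
        TimeoutWatchdog slow = new TimeoutWatchdog();
        slow.arm(50);
        System.out.println("slow timed out: " + slow.awaitTimedOut(2_000));
        slow.shutdown();
    }
}
```

The key design point mirrored here is that the timer is armed before the work is handed off, so a worker that never reports back is still detected.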

2.2 BroadcastTimeout

The BroadcastTimeout ANR works much like the Service one. Here is the source of BroadcastQueue.processNextBroadcastLocked:

final void processNextBroadcastLocked(boolean fromMsg, boolean skipOomAdj) {
    ......
    
    do {
        if (mOrderedBroadcasts.size() == 0) {
            // No more broadcasts pending, so all done!
            mService.scheduleAppGcsLocked();
            if (looped) {
                // If we had finished the last ordered broadcast, then
                // make sure all processes have correct oom and sched
                // adjustments.
                mService.updateOomAdjLocked();
            }
            return;
        }
        r = mOrderedBroadcasts.get(0);
        boolean forceReceive = false;

        // Ensure that even if something goes awry with the timeout
        // detection, we catch "hung" broadcasts here, discard them,
        // and continue to make progress.
        //
        // This is only done if the system is ready so that PRE_BOOT_COMPLETED
        // receivers don't get executed with timeouts. They're intended for
        // one time heavy lifting after system upgrades and can take
        // significant amounts of time.
        int numReceivers = (r.receivers != null) ? r.receivers.size() : 0;
        if (mService.mProcessesReady && r.dispatchTime > 0) {
            long now = SystemClock.uptimeMillis();
            if ((numReceivers > 0) &&
                    (now > r.dispatchTime + (2*mTimeoutPeriod*numReceivers))) {
                Slog.w(TAG, "Hung broadcast ["
                        + mQueueName + "] discarded after timeout failure:"
                        + " now=" + now
                        + " dispatchTime=" + r.dispatchTime
                        + " startTime=" + r.receiverTime
                        + " intent=" + r.intent
                        + " numReceivers=" + numReceivers
                        + " nextReceiver=" + r.nextReceiver
                        + " state=" + r.state);
                // 1. Enter the ANR timeout handling
                broadcastTimeoutLocked(false); // forcibly finish this broadcast
                forceReceive = true;
                r.state = BroadcastRecord.IDLE;
            }
        }

        if (r.state != BroadcastRecord.IDLE) {
            if (DEBUG_BROADCAST) Slog.d(TAG_BROADCAST,
                    "processNextBroadcast("
                    + mQueueName + ") called when not idle (state="
                    + r.state + ")");
            return;
        }

        if (r.receivers == null || r.nextReceiver >= numReceivers
                || r.resultAbort || forceReceive) {
            // No more receivers for this broadcast!  Send the final
            // result if requested...
            if (r.resultTo != null) {
                try {
                    if (DEBUG_BROADCAST) Slog.i(TAG_BROADCAST,
                            "Finishing broadcast [" + mQueueName + "] "
                            + r.intent.getAction() + " app=" + r.callerApp);
                    // 2. Handle the broadcast
                    performReceiveLocked(r.callerApp, r.resultTo,
                        new Intent(r.intent), r.resultCode,
                        r.resultData, r.resultExtras, false, false, r.userId);
                    // Set this to null so that the reference
                    // (local and remote) isn't kept in the mBroadcastHistory.
                    r.resultTo = null;
                } catch (RemoteException e) {
                    r.resultTo = null;
                    Slog.w(TAG, "Failure ["
                            + mQueueName + "] sending broadcast result of "
                            + r.intent, e);

                }
            }

            if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Cancelling BROADCAST_TIMEOUT_MSG");
            cancelBroadcastTimeoutLocked();

            if (DEBUG_BROADCAST_LIGHT) Slog.v(TAG_BROADCAST,
                    "Finished with ordered broadcast " + r);

            // ... and on to the next...
            addBroadcastToHistoryLocked(r);
            if (r.intent.getComponent() == null && r.intent.getPackage() == null
                    && (r.intent.getFlags()&Intent.FLAG_RECEIVER_REGISTERED_ONLY) == 0) {
                // This was an implicit broadcast... let's record it for posterity.
                mService.addBroadcastStatLocked(r.intent.getAction(), r.callerPackage,
                        r.manifestCount, r.manifestSkipCount, r.finishTime-r.dispatchTime);
            }
            mOrderedBroadcasts.remove(0);
            r = null;
            looped = true;
            continue;
        }
    } while (r == null);

    ......
}

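In the code above, comment 1 marks where the ANR timeout handling is entered, and comment 2 marks where the broadcast is handled. The broadcastTimeoutLocked function called at comment 1 eventually calls setBroadcastTimeoutLocked, as the code below shows, and the timeout deadline is r.receiverTime + mTimeoutPeriod. Note that this re-arming branch is only taken when broadcastTimeoutLocked is invoked with fromMsg == true; the call at comment 1 passes false and forcibly finishes the hung broadcast.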

final void broadcastTimeoutLocked(boolean fromMsg) {
    ......
    if (fromMsg) {
        if (!mService.mProcessesReady) {
            // Only process broadcast timeouts if the system is ready. That way
            // PRE_BOOT_COMPLETED broadcasts can't timeout as they are intended
            // to do heavy lifting for system up.
            return;
        }

        long timeoutTime = r.receiverTime + mTimeoutPeriod;
        if (timeoutTime > now) {
            if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
                    "Premature timeout ["
                    + mQueueName + "] @ " + now + ": resetting BROADCAST_TIMEOUT_MSG for "
                    + timeoutTime);
            // Post the delayed timeout message
            setBroadcastTimeoutLocked(timeoutTime);
            return;
        }
    }
    ......
}

final void setBroadcastTimeoutLocked(long timeoutTime) {
    if (! mPendingBroadcastTimeoutMessage) {
        Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this);
        mHandler.sendMessageAtTime(msg, timeoutTime);
        mPendingBroadcastTimeoutMessage = true;
    }
}
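
The deadline check in the fromMsg branch above comes down to one comparison: if receiverTime + mTimeoutPeriod is still in the future when the timeout message arrives, the message is re-posted for that deadline; otherwise the timeout is real. A minimal sketch of that decision follows, with BroadcastDeadline and Outcome as hypothetical names and plain long values standing in for SystemClock.uptimeMillis().

```java
/** Hypothetical model of the deadline check in broadcastTimeoutLocked(fromMsg = true). */
class BroadcastDeadline {
    /** What to do when the timeout message is handled. */
    enum Outcome { REARM, TIMED_OUT }

    private final long timeoutPeriodMillis;

    BroadcastDeadline(long timeoutPeriodMillis) {
        this.timeoutPeriodMillis = timeoutPeriodMillis;
    }

    /** Mirrors: long timeoutTime = r.receiverTime + mTimeoutPeriod; */
    long deadline(long receiverTimeMillis) {
        return receiverTimeMillis + timeoutPeriodMillis;
    }

    /** Mirrors: if (timeoutTime > now) { setBroadcastTimeoutLocked(timeoutTime); return; } */
    Outcome onTimeoutMessage(long nowMillis, long receiverTimeMillis) {
        return deadline(receiverTimeMillis) > nowMillis ? Outcome.REARM : Outcome.TIMED_OUT;
    }
}

class BroadcastDeadlineDemo {
    public static void main(String[] args) {
        BroadcastDeadline fg = new BroadcastDeadline(10_000);
        // The message arrives 7 s after the receiver started: premature, so re-arm.
        System.out.println(fg.onTimeoutMessage(12_000, 5_000));
        // The message arrives exactly at the deadline: treated as a timeout.
        System.out.println(fg.onTimeoutMessage(15_000, 5_000));
    }
}
```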

mTimeoutPeriod is passed in at construction time, as shown below. It is clear that the timeout is 10 seconds for a foreground BroadcastReceiver and 60 seconds for a background BroadcastReceiver.

// BroadcastQueue code
BroadcastQueue(ActivityManagerService service, Handler handler,
        String name, long timeoutPeriod, boolean allowDelayBehindServices) {
    mService = service;
    mHandler = new BroadcastHandler(handler.getLooper());
    mQueueName = name;
    mTimeoutPeriod = timeoutPeriod;
    mDelayBehindServices = allowDelayBehindServices;
}

// ActivityManagerService code
static final int BROADCAST_FG_TIMEOUT = 10*1000;
static final int BROADCAST_BG_TIMEOUT = 60*1000;

mFgBroadcastQueue = new BroadcastQueue(this, mHandler,
        "foreground", BROADCAST_FG_TIMEOUT, false);
mBgBroadcastQueue = new BroadcastQueue(this, mHandler,
        "background", BROADCAST_BG_TIMEOUT, true);
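
With these two constants, the "hung broadcast" guard in processNextBroadcastLocked (now > r.dispatchTime + 2*mTimeoutPeriod*numReceivers) can be worked through with concrete numbers. The sketch below is an illustration only: HungBroadcastCheck and isHung are hypothetical names, and the additional conditions in the original code (mProcessesReady and r.dispatchTime > 0) are left out.

```java
/** Hypothetical worked example of the hung-broadcast threshold. */
class HungBroadcastCheck {
    static final long BROADCAST_FG_TIMEOUT = 10 * 1000;
    static final long BROADCAST_BG_TIMEOUT = 60 * 1000;

    /** Mirrors: numReceivers > 0 && now > dispatchTime + 2 * timeoutPeriod * numReceivers. */
    static boolean isHung(long now, long dispatchTime, long timeoutPeriod, int numReceivers) {
        return numReceivers > 0
                && now > dispatchTime + 2 * timeoutPeriod * numReceivers;
    }

    public static void main(String[] args) {
        // Foreground queue, 3 receivers: threshold is 2 * 10 s * 3 = 60 s after dispatch.
        System.out.println(isHung(60_000, 0, BROADCAST_FG_TIMEOUT, 3));
        System.out.println(isHung(60_001, 0, BROADCAST_FG_TIMEOUT, 3));
        // Background queue, 3 receivers: threshold is 2 * 60 s * 3 = 360 s after dispatch.
        System.out.println(isHung(360_001, 0, BROADCAST_BG_TIMEOUT, 3));
    }
}
```

The factor of two gives the whole ordered broadcast twice the per-receiver budget before this fallback discards it, so the regular per-receiver timeout normally fires first.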

2.3 ProcessContentproviderPublishTimeoutLocked

When an app process starts, ActivityThread's main function calls attach, which calls ActivityManagerService.attachApplication and then attachApplicationLocked. At comment 1 below, a delayed message is posted, which starts the ANR monitoring; CONTENT_PROVIDER_PUBLISH_TIMEOUT is 10 seconds.


static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10*1000;

private final boolean attachApplicationLocked(IApplicationThread thread,
        int pid, int callingUid, long startSeq) {

    ......

    mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);

    boolean normalMode = mProcessesReady || isAllowedWhileBooting(app.info);
    List<ProviderInfo> providers = normalMode ? generateApplicationProvidersLocked(app) : null;

    if (providers != null && checkAppInLaunchingProvidersLocked(app)) {
        Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);
        msg.obj = app;
        // 1. Start ANR monitoring
        mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_PUBLISH_TIMEOUT);
    }

    checkTime(startTime, "attachApplicationLocked: before bindApplication");

    ......
}

Once the ContentProvider has been published successfully, ActivityManagerService removes this message, as shown at comment 1 in the code below.

public final void publishContentProviders(IApplicationThread caller,
            List<ContentProviderHolder> providers) {
        if (providers == null) {
            return;
        }

        enforceNotIsolatedCaller("publishContentProviders");
        synchronized (this) {
            ......
            for (int i = 0; i < N; i++) {
                ContentProviderHolder src = providers.get(i);
                if (src == null || src.info == null || src.provider == null) {
                    continue;
                }
                ContentProviderRecord dst = r.pubProviders.get(src.info.name);
                if (DEBUG_MU) Slog.v(TAG_MU, "ContentProviderRecord uid = " + dst.uid);
                if (dst != null) {
                    ComponentName comp = new ComponentName(dst.info.packageName, dst.info.name);
                    mProviderMap.putProviderByClass(comp, dst);
                    String names[] = dst.info.authority.split(";");
                    for (int j = 0; j < names.length; j++) {
                        mProviderMap.putProviderByName(names[j], dst);
                    }

                    int launchingCount = mLaunchingProviders.size();
                    int j;
                    boolean wasInLaunchingProviders = false;
                    for (j = 0; j < launchingCount; j++) {
                        if (mLaunchingProviders.get(j) == dst) {
                            mLaunchingProviders.remove(j);
                            wasInLaunchingProviders = true;
                            j--;
                            launchingCount--;
                        }
                    }
                    if (wasInLaunchingProviders) {
                        // 1. Remove the timeout message
                        mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);
                    }
                    synchronized (dst) {
                        dst.provider = src.provider;
                        dst.proc = r;
                        dst.notifyAll();
                    }
                    updateOomAdjLocked(r, true);
                    maybeUpdateProviderUsageStatsLocked(r, src.info.packageName,
                            src.info.authority);
                }
            }

            Binder.restoreCallingIdentity(origId);
        }
    }

2.4 KeyDispatchTimeout

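InputManagerService starts two threads, the InputReader thread and InputDispatcherThread. InputDispatcherThread is woken by the reader thread; once awake, its threadLoop keeps looping to process input events.
The overall flow is as follows:
[Figure: input dispatch flow (image not included)]
Image source: https://www.jianshu.com/p/fd376366e031?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation
Here we focus on the handleTargetsNotReadyLocked function.
At comment 1 below, when the dispatcher first finds that the target application is not ready for an event, it sets mInputTargetWaitTimeoutTime to the current time plus the timeout. On a later invocation (for example when the next input event is processed), the current time is compared with mInputTargetWaitTimeoutTime (comment 2); if the current time has reached or passed mInputTargetWaitTimeoutTime, an ANR is triggered.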

// Default timeout: 5 seconds
constexpr nsecs_t DEFAULT_INPUT_DISPATCHING_TIMEOUT = 5000 * 1000000LL; // 5 sec

int32_t InputDispatcher::handleTargetsNotReadyLocked(
        nsecs_t currentTime, const EventEntry* entry,
        const sp<InputApplicationHandle>& applicationHandle,
        const sp<InputWindowHandle>& windowHandle, nsecs_t* nextWakeupTime, const char* reason) {
    if (applicationHandle == nullptr && windowHandle == nullptr) {
        if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY) {
#if DEBUG_FOCUS
            ALOGD("Waiting for system to become ready for input.  Reason: %s", reason);
#endif
            mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY;
            mInputTargetWaitStartTime = currentTime;
            mInputTargetWaitTimeoutTime = LONG_LONG_MAX;
            mInputTargetWaitTimeoutExpired = false;
            mInputTargetWaitApplicationToken.clear();
        }
    } else {
        if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY) {
#if DEBUG_FOCUS
            ALOGD("Waiting for application to become ready for input: %s.  Reason: %s",
                  getApplicationWindowLabel(applicationHandle, windowHandle).c_str(), reason);
#endif
            nsecs_t timeout;
            if (windowHandle != nullptr) {
                timeout = windowHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
            } else if (applicationHandle != nullptr) {
                timeout =
                        applicationHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
            } else {
                timeout = DEFAULT_INPUT_DISPATCHING_TIMEOUT;
            }

            mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY;
            mInputTargetWaitStartTime = currentTime;
            // 1. Deadline = current time + timeout
            mInputTargetWaitTimeoutTime = currentTime + timeout;
            mInputTargetWaitTimeoutExpired = false;
            mInputTargetWaitApplicationToken.clear();

            if (windowHandle != nullptr) {
                mInputTargetWaitApplicationToken = windowHandle->getApplicationToken();
            }
            if (mInputTargetWaitApplicationToken == nullptr && applicationHandle != nullptr) {
                mInputTargetWaitApplicationToken = applicationHandle->getApplicationToken();
            }
        }
    }

    if (mInputTargetWaitTimeoutExpired) {
        return INPUT_EVENT_INJECTION_TIMED_OUT;
    }
    // 2. Trigger the ANR
    if (currentTime >= mInputTargetWaitTimeoutTime) {
        onANRLocked(currentTime, applicationHandle, windowHandle, entry->eventTime,
                    mInputTargetWaitStartTime, reason);

        // Force poll loop to wake up immediately on next iteration once we get the
        // ANR response back from the policy.
        *nextWakeupTime = LONG_LONG_MIN;
        return INPUT_EVENT_INJECTION_PENDING;
    } else {
        // Force poll loop to wake up when timeout is due.
        if (mInputTargetWaitTimeoutTime < *nextWakeupTime) {
            *nextWakeupTime = mInputTargetWaitTimeoutTime;
        }
        return INPUT_EVENT_INJECTION_PENDING;
    }
}
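
Stripped of logging and token handling, the function above is a small state machine: the first "not ready" call records a deadline, and every later call compares the current time against it. A minimal Java sketch of that logic follows. InputWaitTracker, Result and reset are hypothetical names; unlike the original, which returns INPUT_EVENT_INJECTION_PENDING in both branches and reports the ANR through onANRLocked, this sketch returns a distinct value so the outcome is visible.

```java
/** Hypothetical model of the deadline bookkeeping in handleTargetsNotReadyLocked. */
class InputWaitTracker {
    enum Result { PENDING, ANR }

    /** Same value as DEFAULT_INPUT_DISPATCHING_TIMEOUT: 5 seconds in nanoseconds. */
    static final long DEFAULT_TIMEOUT_NANOS = 5000L * 1_000_000L;

    private boolean waiting = false;
    private long deadlineNanos;

    /** Called each time the dispatcher finds the target application not ready. */
    Result onTargetNotReady(long currentTimeNanos) {
        if (!waiting) {
            // Comment 1 in the original: deadline = current time + timeout.
            waiting = true;
            deadlineNanos = currentTimeNanos + DEFAULT_TIMEOUT_NANOS;
        }
        // Comment 2 in the original: the ANR fires once the deadline is reached.
        return currentTimeNanos >= deadlineNanos ? Result.ANR : Result.PENDING;
    }

    /** Clears the wait state, for example once the target becomes ready again. */
    void reset() {
        waiting = false;
    }
}

class InputWaitTrackerDemo {
    public static void main(String[] args) {
        InputWaitTracker tracker = new InputWaitTracker();
        long start = 1_000_000_000L;
        System.out.println(tracker.onTargetNotReady(start));                  // first call arms the deadline
        System.out.println(tracker.onTargetNotReady(start + 4_999_999_999L)); // still inside the window
        System.out.println(tracker.onTargetNotReady(start + 5_000_000_000L)); // deadline reached
    }
}
```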

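3. Summary

Comparing these ANR scenarios, the principle is the same in every case: a time bomb is planted before the work starts. If the work finishes within the allotted time, the bomb is defused in time; otherwise it goes off and an ANR is produced.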

Reference: https://www.jianshu.com/p/fd376366e031?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation
