Memory leak leading to a crash

Posted on 2020-06-30 23:32:33

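I'm using an AIR720 4G module with the standard driver plus MYMQTT. It is connected directly through the AT_DEVICE component, without the PPP component. After the device has been online for a long time, it runs out of memory. Nothing shows up over short periods. The problem is most visible when I unplug the antenna and remove the SIM card to force the device to reconnect (reconnection uses the netdev SET DOWN and SET UP interfaces): after a few days of running, memory usage keeps growing and is never released. It looks like the system keeps reconnecting and each reconnect does not free all of its memory. Does anyone have experience with this and know how to solve it? In the end the device crashes and hangs, printing: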

assertion failed at function:rt_mutex_take
Function[rt_malloc] shall not be used in ISR
thread   pri  status      sp     stack size max used left tick  error
-------- ---  ------- ---------- ----------  ------  ---------- ---
mqttt0    20  suspend 0x00000168 0x00000a8c    62%   0x00000001 000
。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。
timer      4  suspend 0x000000a8 0x00000200    32%   0x00000009 000
main      10  suspend 0x000000e4 0x000006bc    57%   0x0000000b 000
06-20 23:29:39.527 E/cmb ISR: 
06-20 23:29:39.527 E/cmb ISR: (0) has assert failed at rt_malloc:280.
06-20 23:29:39.527 E/cmb ISR: 
06-20 23:29:39.527 E/cmb ISR: Firmware name: testff, hardware version: v1.01, software version: v1.02
06-20 23:29:39.527 E/cmb ISR: Assert on thread at_clnt
06-20 23:29:39.527 E/cmb ISR: ===== Thread stack information =====
06-20 23:29:39.527 E/cmb ISR:   addr: 20009300    data: 0803f654
06-20 23:29:39.527 E/cmb ISR:   addr: 20009304    data: 0804bdc4
06-20 23:29:39.527 E/cmb ISR:   addr: 20009308    data: 00000118
06-20 23:29:39.527 E/cmb ISR:   addr: 2000930c    data: 08001a8d
06-20 23:29:39.527 E/cmb ISR:   addr: 20009310    data: 0000002d
06-20 23:29:39.527 E/cmb ISR:   addr: 20009314    data: 00000118
06-20 23:29:39.527 E/cmb ISR:   addr: 20009318    data: 0804bdc4
06-20 23:29:39.527 E/cmb ISR:   addr: 2000931c    data: 0803f654
06-20 23:29:39.527 E/cmb ISR:   addr: 20009320    data: 00000009
06-20 23:29:39.527 E/cmb ISR:   addr: 20009324    data: 01008af0
06-20 23:29:39.527 E/cmb ISR:   addr: 20009328    data: 20009330
06-20 23:29:39.527 E/cmb ISR:   addr: 2000932c    data: 20008af0
06-20 23:29:39.527 E/cmb ISR:   addr: 20009330    data: 20009338
06-20 23:29:39.527 E/cmb ISR:   addr: 20009334    data: 080044a1
06-20 23:29:39.527 E/cmb ISR:   addr: 20009338    data: 014b2638
06-20 23:29:39.527 E/cmb ISR:   addr: 2000933c    data: 00000118
06-20 23:29:39.527 E/cmb ISR:   addr: 20009340    data: 0804bdc4
06-20 23:29:39.527 E/cmb ISR:   addr: 20009344    data: 0803f654
06-20 23:29:39.527 E/cmb ISR:   addr: 20009348    data: 0803f62c

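After analysis, the crash is triggered by the at_clnt (AT CLIENT component) thread. A closer look shows a memory leak once the system has been running for a while (at least one day; it is not visible over shorter periods), as shown in this heap dump: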

[0x20009b60 -    12] at_c
[0x20009b7c -    20] mqtt
[0x20009ba0 -    64] mqtt
[0x20009bf0 -    24]     
[0x20009c18 -   628] at_c
[0x20009e9c -   568] at_c
[0x2000a0e4 -    64] mqtt
[0x2000a134 -    48] mqtt
[0x2000a174 -    64] mqtt
[0x2000a1c4 -    60] at_c
[0x2000a210 -    60] at_c
[0x2000a25c -    28] mqtt
[0x2000a288 -    1K] main
[0x2000a698 -    1K] main
[0x2000aaa8 -    56] main
[0x2000aaf0 -    36] main
[0x2000ab24 -    60] main
[0x2000ab70 -    64] main
[0x2000abc0 -   128] main
[0x2000ac50 -    2K] main
[0x2000b6ec -    36] Make
[0x2000b720 -    44] Make
[0x2000b75c -    36] Timi
[0x2000b790 -    60] at_c
[0x2000b7dc -    92] at_c
[0x2000b848 -    64]     
[0x2000b898 -    32] mqtt
[0x2000b8c8 -    36] mqtt
[0x2000b8fc -    12]     
[0x2000b918 -    36] mqtt
[0x2000b94c -   128]     
[0x2000b9dc -    64] mqtt
[0x2000ba2c -    44]     
[0x2000ba68 -    44] Equi
[0x2000baa4 -    44] Equi
[0x2000bae0 -    44] Equi
[0x2000bb1c -   568] at_c
[0x2000bd64 -   568] at_c
[0x2000bfac -   568] at_c
[0x2000c1f4 -    2K]     
[0x2000ca04 -    64] mqtt

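The dump clearly shows the at_c (that is, at_clnt) allocations growing in number and never being released, even after the network has reconnected normally. My guess is that the device frequently dropping offline and reconnecting makes the leak grow until the system finally crashes.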
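To compare two dumps taken hours apart, it helps to total the live blocks per owner rather than eyeball the list. The following is a minimal host-side sketch, assuming the dump format shown above (`[address - size] owner`, with sizes sometimes abbreviated as `1K` or `2K`). The names `owner_total`, `parse_dump_line` and `tally` are hypothetical helpers, not part of RT-Thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: running totals for one owner tag. */
struct owner_total {
    char name[8];          /* owner tag as printed in the dump, e.g. "at_c" */
    unsigned blocks;       /* number of live blocks seen */
    unsigned long bytes;   /* sum of their sizes */
};

/* Parse one line such as "[0x20009c18 -   628] at_c" or
 * "[0x2000a288 -    1K] main". Returns 1 on success, 0 otherwise.
 * Lines with a blank owner yield an empty string in `owner`. */
int parse_dump_line(const char *line, char owner[8], unsigned long *bytes)
{
    unsigned long addr = 0, size = 0;
    int used = 0;
    size_t i = 0;
    const char *p;

    if (sscanf(line, "[%lx - %lu%n", &addr, &size, &used) != 2)
        return 0;
    p = line + used;
    if (*p == 'K') {       /* the dump rounds large blocks to whole KiB */
        size *= 1024;
        p++;
    }
    if (*p != ']')
        return 0;
    p++;
    while (*p == ' ')
        p++;
    while (*p != '\0' && *p != ' ' && *p != '\n' && *p != '\r' && i < 7)
        owner[i++] = *p++;
    owner[i] = '\0';
    *bytes = size;
    return 1;
}

/* Add one block to the table; returns the new number of entries.
 * Owners beyond `cap` distinct names are silently ignored. */
int tally(struct owner_total *tab, int n, int cap,
          const char *owner, unsigned long bytes)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(tab[i].name, owner) == 0) {
            tab[i].blocks++;
            tab[i].bytes += bytes;
            return n;
        }
    }
    if (n < cap) {
        strncpy(tab[n].name, owner, sizeof(tab[n].name) - 1);
        tab[n].name[sizeof(tab[n].name) - 1] = '\0';
        tab[n].blocks = 1;
        tab[n].bytes = bytes;
        n++;
    }
    return n;
}
```

Feeding each dump line through `parse_dump_line` and `tally` gives one row per owner. If the `at_c` block count rises between two dumps while the other owners stay flat, that supports the reading above. Because the `K` sizes are rounded in the dump, byte totals for large blocks are approximate.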

5 answers
JYFP_3506 2020-06-30

The same problem occurs with other 4G modules. The AT component version and configuration I'm using are:

#define PKG_USING_AT_DEVICE
#define AT_DEVICE_USING_AIR720
#define AT_DEVICE_AIR720_INIT_ASYN
#define AT_DEVICE_AIR720_SAMPLE
#define AIR720_SAMPLE_POWER_PIN 16
#define AIR720_SAMPLE_STATUS_PIN -1
#define AIR720_SAMPLE_CLIENT_NAME "uart2"
#define AIR720_SAMPLE_RECV_BUFF_LEN 512
#define PKG_USING_AT_DEVICE_LATEST_VERSION
#define PKG_AT_DEVICE_VER_NUM 0x99999

#define RT_USING_AT
#define AT_USING_CLIENT
#define AT_CLIENT_NUM_MAX 1
#define AT_USING_SOCKET
#define AT_USING_CLI
#define AT_CMD_MAX_LEN 128
#define AT_SW_VERSION_NUM 0x10300

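On the surface, the crash message suggests that rt_mutex_take was called inside an interrupt, but I have checked the code and none of the interrupt handlers in the program call rt_mutex_take. If one did, it would have crashed long ago instead of running for 5 or 6 days first.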
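For context on that message: RT-Thread prints it when the kernel's interrupt nesting counter (the value returned by `rt_interrupt_get_nest()`) is non-zero at the time of the call. It can therefore be triggered by code that runs in interrupt context without being a hand-written ISR, for example the timeout function of a timer created without `RT_TIMER_FLAG_SOFT_TIMER`. Whether that is what happens here is not established. The sketch below is a host-side model of the check, not RT-Thread source, and every `model_*` name is hypothetical.

```c
#include <stdio.h>

/* Model of the kernel's interrupt nesting counter. */
static volatile int interrupt_nest = 0;

void model_interrupt_enter(void) { interrupt_nest++; }
void model_interrupt_leave(void) { interrupt_nest--; }

/* Model of the context check: returns 1 if the call is allowed,
 * 0 if the real kernel would print the message and assert. */
int model_check_thread_context(const char *func)
{
    if (interrupt_nest != 0) {
        printf("Function[%s] shall not be used in ISR\n", func);
        return 0;
    }
    return 1;
}

typedef int (*timeout_fn)(void);

/* A timer callback that allocates memory (or would take a mutex). */
int model_timeout_callback(void)
{
    return model_check_thread_context("rt_malloc");
}

/* Hard timer: the callback runs from the tick handler, so it executes
 * between interrupt enter and leave. */
int model_run_hard_timer(timeout_fn cb)
{
    int ok;

    model_interrupt_enter();
    ok = cb();
    model_interrupt_leave();
    return ok;
}

/* Soft timer: the callback runs in the timer thread, so the nesting
 * counter is zero. */
int model_run_soft_timer(timeout_fn cb)
{
    return cb();
}
```

In this model the same callback passes when run as a soft timer and fails when run as a hard timer, which is why a search limited to the ISR handlers themselves can miss the offending call.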

JYFP_3506 2020-06-30

Sometimes the crash prints the following instead:

Function[rt_mutex_take] shall not be used in ISR
thread   pri  status      sp     stack size max used left tick  error
-------- ---  ------- ---------- ----------  ------  ---------- ---
sim0_li   10  suspend 0x000000d8 0x0000052c    80%   0x00000004 000
mqttt0    20  suspend 0x0000016c 0x00000a8c    73%   0x00000002 000
at_clnt    9  suspend 0x00000104 0x00000600    72%   0x00000005 000
.................................................
..............................................
timer      4  suspend 0x000000a8 0x00000200    32%   0x00000009 000
main      10  suspend 0x000000e0 0x000006bc    60%   0x00000001 000
05-19 04:43:09.053 E/cmb ISR: 
05-19 04:43:09.053 E/cmb ISR: (0) has assert failed at rt_mutex_take:662.
total memory: 47020
used memory : 38108
maximum allocated memory: 38188
05-19 04:43:09.053 E/cmb ISR: 
05-19 04:43:09.053 E/cmb ISR: Firmware name: testff, hardware version: v1.01, software version: v1.02
05-19 04:43:09.053 E/cmb ISR: Assert on thread Communic
05-19 04:43:09.053 E/cmb ISR: ===== Thread stack information =====
05-19 04:43:09.053 E/cmb ISR:   addr: 20008d88    data: 08040abc
05-19 04:43:09.053 E/cmb ISR:   addr: 20008d8c    data: 0804dccc
05-19 04:43:09.053 E/cmb ISR:   addr: 20008d90    data: 00000296
05-19 04:43:09.053 E/cmb ISR:   addr: 20008d94    data: 08001a8d
05-19 04:43:09.053 E/cmb ISR:   addr: 20008d98    data: 00000031
05-19 04:43:09.053 E/cmb ISR:   addr: 20008d9c    data: 00000296
KAWAII-MQTT
This MQTT package works well. The official umqtt also works, but its feature set is not complete yet.
Cheney_Chen 2020-07-02
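  1. Function[rt_mutex_take] shall not be used in ISR: check whether software timer support is enabled.
  2. Because both the at_device and mymqtt packages are involved, it is not clear which of them leaks memory on reconnect. I suggest testing each one separately.
  3. I suggest using the at_device package on its own and writing a simple test interface for the netdev UP and DOWN operations, then calling it many times to see whether memory leaks. Separately, test whether the reconnect feature of the mymqtt package leaks memory.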
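The isolated test suggested in point 3 could be structured along these lines. This is a minimal sketch that runs on a host against a fake device. On the target, the three hooks would presumably wrap `netdev_set_down()`, `netdev_set_up()` and the `used` value reported by `rt_memory_info()`. The `leak_hooks`, `leak_check` and `fake_*` names are hypothetical, and the 568-byte session size is only borrowed from the dump for illustration.

```c
#include <stddef.h>

/* Hypothetical hooks so the harness is independent of the platform. */
struct leak_hooks {
    void   (*link_down)(void *ctx);
    void   (*link_up)(void *ctx);
    size_t (*used_bytes)(void *ctx);
    void   *ctx;
};

/* Cycle the link `cycles` times and return how many bytes the used heap
 * grew by (0 if it stayed flat or shrank). */
size_t leak_check(const struct leak_hooks *h, unsigned cycles)
{
    size_t before, after;

    /* One warm-up cycle so one-time allocations are not counted. */
    h->link_down(h->ctx);
    h->link_up(h->ctx);
    before = h->used_bytes(h->ctx);

    for (unsigned i = 0; i < cycles; i++) {
        h->link_down(h->ctx);
        h->link_up(h->ctx);
    }
    after = h->used_bytes(h->ctx);
    return after > before ? after - before : 0;
}

/* Fake device used only to exercise the harness: each link-up allocates
 * a session buffer and additionally leaks `leak_per_up` bytes. */
struct fake_dev {
    size_t used;
    size_t session;
    size_t leak_per_up;
};

void fake_down(void *c)
{
    struct fake_dev *d = c;

    d->used -= d->session;   /* session buffer is freed */
    d->session = 0;
}

void fake_up(void *c)
{
    struct fake_dev *d = c;

    d->session = 568;        /* illustrative size only */
    d->used += d->session + d->leak_per_up;
}

size_t fake_used(void *c)
{
    return ((struct fake_dev *)c)->used;
}
```

Growth that scales with the number of cycles points at the down/up path itself. Running the same harness with only at_device, and then with only the mymqtt reconnect, separates the two suspects.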
JQRR_7669 Certified Expert 1 day ago

In the air720_netdev_add function in at_device_air720.c, after

RT_ASSERT(netdev_name);

add the following code:

netdev = netdev_get_by_name(netdev_name);
if (netdev != RT_NULL)
{
    return (netdev);
}

That should solve the problem.
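The idea behind this guard can be shown with a small host-side model, assuming, as the answer implies, that air720_netdev_add is reached again on each re-initialization and allocates a new object every time. This is a sketch of the pattern only, not the at_device source. The `model_*` names and the `guard` parameter are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for a registered network device. */
struct model_netdev {
    char name[8];
    struct model_netdev *next;
};

static struct model_netdev *registry = NULL;
static unsigned alloc_count = 0;

/* Look up an already registered device by name. */
struct model_netdev *model_get_by_name(const char *name)
{
    for (struct model_netdev *nd = registry; nd != NULL; nd = nd->next) {
        if (strcmp(nd->name, name) == 0)
            return nd;
    }
    return NULL;
}

/* Register a device. With guard != 0 the early return from the answer is
 * applied, so repeated calls reuse the existing object. */
struct model_netdev *model_netdev_add(const char *name, int guard)
{
    struct model_netdev *nd;

    if (guard) {
        nd = model_get_by_name(name);
        if (nd != NULL)
            return nd;
    }

    nd = calloc(1, sizeof(*nd));
    if (nd == NULL)
        return NULL;
    alloc_count++;
    strncpy(nd->name, name, sizeof(nd->name) - 1);
    nd->next = registry;
    registry = nd;
    return nd;
}

unsigned model_alloc_count(void)
{
    return alloc_count;
}
```

With the guard, three calls for the same name allocate once. Without it, every call allocates another object that nothing frees, which matches the pattern of same-sized at_c blocks piling up.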
