暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

定位线上问题不求人

codeImport 2020-03-28
549

前言

我曾经在生产环境遇到一个问题,项目启动很久都没有启动成功,查看日志也没有任何报错,类似于假死的现象。这种问题很多情况下是由于启动时调用第三方服务(连接数据库,连接redis,连接cat)出现了问题,但是又没有设置超时导致的。 

解决办法一般如下: 

  • 使用阿里巴巴开源的Arthas,这个功能很强大,但是当时你的机器上有没有这个插件是个问题,能不能装这个插件又是另一个问题。

  • 使用JVM自带的Jstack,哈哈,那你得祈祷你跟运维GG的关系足够铁。

如何解决

其实JDK是有相关命令获取当时的线程堆栈的,解决思路就是,监听SpringBoot启动事件,如果超过一段时间还没有启动成功,就直接打印Main主线程的堆栈,直到启动成功。

废话不多说了,一个类就能搞定,代码如下:

package com.github.codeimport.vm.beans;


import com.sun.tools.attach.VirtualMachine;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.context.event.ApplicationStartedEvent;
import org.springframework.boot.context.event.ApplicationStartingEvent;
import org.springframework.context.ApplicationEvent;
import org.springframework.context.ApplicationListener;
import sun.tools.attach.HotSpotVirtualMachine;


import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.lang.management.ManagementFactory;
import java.util.concurrent.*;


/**
* @description: 这个类要放入META-INF/spring.factories中,否则无法监听到ApplicationStartingEvent事件
* @author: chengang6
**/
public class VmApplicationListener implements ApplicationListener {


    private Logger logger = LoggerFactory.getLogger(this.getClass());


/**
     * Main线程名字,我们只关注这个线程
*/
    private final String MAIN_THREAD_NAME = "CodeImportMain";
/**
* 打印线程堆栈的行数
*/
    private final int PRINT_LINES = 20;
    private ScheduledThreadPoolExecutor executorService = new ScheduledThreadPoolExecutor(1new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
Thread thread = new Thread(r);
thread.setDaemon(true);
return thread;
}
});


@Override
public void onApplicationEvent(ApplicationEvent applicationEvent) {
if (applicationEvent instanceof ApplicationStartingEvent) {
Thread.currentThread().setName(MAIN_THREAD_NAME);
startSchedule();
} else if (applicationEvent instanceof ApplicationStartedEvent) {
executorService.shutdown();
}
}


private void startSchedule() {
executorService.scheduleWithFixedDelay(new Runnable() {
@Override
public void run() {
try {
stack();
} catch (Exception ex) {
ex.printStackTrace();
}
}
}, 10000, 10000, TimeUnit.MILLISECONDS);
}


private void stack() throws Exception {
// 首先获取到当前系统的pid
String name = ManagementFactory.getRuntimeMXBean().getName();
String pid = name.split("@")[0];
VirtualMachine vm = VirtualMachine.attach(pid);


// 执行JCMD命令获取线程堆栈
try (InputStream initialStream = ((HotSpotVirtualMachine) vm).executeJCmd("Thread.print");
BufferedReader reader = new BufferedReader(new InputStreamReader(initialStream))) {


while (true) {
String line = reader.readLine();
// 只打印Main线程
if (line != null && line.indexOf(MAIN_THREAD_NAME) > 0) {
logger.warn(line);
// 只打印Main线程的前20行
for (int i = 0; i < PRINT_LINES; i++) {
String str = reader.readLine();
logger.warn(str);
}
} else if (line == null) {
break;
}
}
}


vm.detach();
}
}
package com.github.codeimport.vm;


import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;


import java.util.concurrent.TimeUnit;


/**
* @author: chengang6
**/
@SpringBootApplication
public class VmApplication {
public static void main(String[] args) {
SpringApplication.run(VmApplication.class, args);
    }
    
    @Bean
    public String timeOutBean() {
// 模拟启动时假死的问题
try {
TimeUnit.MILLISECONDS.sleep(500000);
} catch (InterruptedException e) {
e.printStackTrace();
}
return "timeOut";
}


}


输出的日志如下,我们从日志可以很容易看出是卡在VmApplication.timeOutBean这一步了

2020-03-28 16:26:58.375  WARN 17692 --- [       Thread-1] c.g.c.vm.beans.VmApplicationListener     : "CodeImportMain" #1 prio=5 os_prio=0 tid=0x0000000003c36000 nid=0x51ac waiting on condition [0x00000000039bd000]
2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : java.lang.Thread.State: TIMED_WAITING (sleeping)
2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at java.lang.Thread.sleep(Native Method)
2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at java.lang.Thread.sleep(Thread.java:340)
2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:386)
2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at com.github.codeimport.vm.VmApplication.timeOutBean(VmApplication.java:22)
2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at com.github.codeimport.vm.VmApplication$$EnhancerBySpringCGLIB$$26aca632.CGLIB$timeOutBean$0(<generated>)
2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at com.github.codeimport.vm.VmApplication$$EnhancerBySpringCGLIB$$26aca632$$FastClassBySpringCGLIB$$e52ca13.invoke(<generated>)
2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(ConfigurationClassEnhancer.java:365)
2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at com.github.codeimport.vm.VmApplication$$EnhancerBySpringCGLIB$$26aca632.timeOutBean(<generated>)
2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at java.lang.reflect.Method.invoke(Method.java:498)
2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154)
2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:583)
2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1246)
2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1096)
2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:535)
2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:495)
2020-03-28 16:27:08.384 WARN 17692 --- [ Thread-1]


总结

JCMD这个命令很强大,我们可以使用它来解决很多棘手的问题,比如:周末你正在喝着啤酒唱着歌,突然收到线上CPU利用率100%的告警,等你好不容器登陆服务器的时候,CPU又降下来了。。。

References

[1]
 JCMD命令: 

https://jianyuan.me/2018/08/02/jcmd-intro/


近期热文

同一个事务操作多个数据库

聊一聊熟悉又陌生的autoCommit

Dubbo异步调用

文章转载自codeImport,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论