前言
我曾经在生产环境遇到一个问题,项目启动很久都没有启动成功,查看日志也没有任何报错,类似于假死的现象。这种问题很多情况下是由于启动时调用第三方服务(连接数据库,连接redis,连接cat)出现了问题,但是又没有设置超时导致的。
解决办法一般如下:
使用阿里巴巴开源的Arthas,这个功能很强大,但是当时你的机器上有没有这个插件是个问题,能不能装这个插件又是另一个问题。
使用JVM自带的Jstack,哈哈,那你得祈祷你跟运维GG的关系足够铁。
如何解决
其实JDK是有相关命令获取当时的线程堆栈的,解决思路就是,监听SpringBoot启动事件,如果超过一段时间还没有启动成功,就直接打印Main主线程的堆栈,直到启动成功。
废话不多说了,一个类就能搞定,代码如下:
package com.github.codeimport.vm.beans;import com.sun.tools.attach.VirtualMachine;import org.slf4j.Logger;import org.slf4j.LoggerFactory;import org.springframework.boot.context.event.ApplicationStartedEvent;import org.springframework.boot.context.event.ApplicationStartingEvent;import org.springframework.context.ApplicationEvent;import org.springframework.context.ApplicationListener;import sun.tools.attach.HotSpotVirtualMachine;import java.io.BufferedReader;import java.io.InputStream;import java.io.InputStreamReader;import java.lang.management.ManagementFactory;import java.util.concurrent.*;/*** @description: 这个类要放入META-INF/spring.factories中,否则无法监听到ApplicationStartingEvent事件* @author: chengang6**/public class VmApplicationListener implements ApplicationListener {private Logger logger = LoggerFactory.getLogger(this.getClass());/*** Main线程名字,我们只关注这个线程*/private final String MAIN_THREAD_NAME = "CodeImportMain";/*** 打印线程堆栈的行数*/private final int PRINT_LINES = 20;private ScheduledThreadPoolExecutor executorService = new ScheduledThreadPoolExecutor(1, new ThreadFactory() {@Overridepublic Thread newThread(Runnable r) {Thread thread = new Thread(r);thread.setDaemon(true);return thread;}});@Overridepublic void onApplicationEvent(ApplicationEvent applicationEvent) {if (applicationEvent instanceof ApplicationStartingEvent) {Thread.currentThread().setName(MAIN_THREAD_NAME);startSchedule();} else if (applicationEvent instanceof ApplicationStartedEvent) {executorService.shutdown();}}private void startSchedule() {executorService.scheduleWithFixedDelay(new Runnable() {@Overridepublic void run() {try {stack();} catch (Exception ex) {ex.printStackTrace();}}}, 10000, 10000, TimeUnit.MILLISECONDS);}private void stack() throws Exception {// 首先获取到当前系统的pidString name = ManagementFactory.getRuntimeMXBean().getName();String pid = name.split("@")[0];VirtualMachine vm = VirtualMachine.attach(pid);// 执行JCMD命令获取线程堆栈try (InputStream initialStream = ((HotSpotVirtualMachine) vm).executeJCmd("Thread.print");BufferedReader reader = new BufferedReader(new InputStreamReader(initialStream))) {while (true) {String line = reader.readLine();// 只打印Main线程if (line != null && line.indexOf(MAIN_THREAD_NAME) > 0) {logger.warn(line);// 只打印Main线程的前20行for (int i = 0; i < PRINT_LINES; i++) {String str = reader.readLine();logger.warn(str);}} else if (line == null) {break;}}}vm.detach();}}
package com.github.codeimport.vm;import org.springframework.boot.SpringApplication;import org.springframework.boot.autoconfigure.SpringBootApplication;import org.springframework.context.annotation.Bean;import java.util.concurrent.TimeUnit;/*** @author: chengang6**/@SpringBootApplicationpublic class VmApplication {public static void main(String[] args) {SpringApplication.run(VmApplication.class, args);}@Beanpublic String timeOutBean() {// 模拟启动时假死的问题try {TimeUnit.MILLISECONDS.sleep(500000);} catch (InterruptedException e) {e.printStackTrace();}return "timeOut";}}
输出的日志如下,我们从日志可以很容易看出是卡在VmApplication.timeOutBean这一步了
2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : "CodeImportMain" #1 prio=5 os_prio=0 tid=0x0000000003c36000 nid=0x51ac waiting on condition [0x00000000039bd000]2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : java.lang.Thread.State: TIMED_WAITING (sleeping)2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at java.lang.Thread.sleep(Native Method)2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at java.lang.Thread.sleep(Thread.java:340)2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:386)2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at com.github.codeimport.vm.VmApplication.timeOutBean(VmApplication.java:22)2020-03-28 16:26:58.375 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at com.github.codeimport.vm.VmApplication$$EnhancerBySpringCGLIB$$26aca632.CGLIB$timeOutBean$0(<generated>)2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at com.github.codeimport.vm.VmApplication$$EnhancerBySpringCGLIB$$26aca632$$FastClassBySpringCGLIB$$e52ca13.invoke(<generated>)2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(ConfigurationClassEnhancer.java:365)2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at com.github.codeimport.vm.VmApplication$$EnhancerBySpringCGLIB$$26aca632.timeOutBean(<generated>)2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at java.lang.reflect.Method.invoke(Method.java:498)2020-03-28 16:26:58.376 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154)2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:583)2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1246)2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1096)2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:535)2020-03-28 16:26:58.377 WARN 17692 --- [ Thread-1] c.g.c.vm.beans.VmApplicationListener : at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:495)2020-03-28 16:27:08.384 WARN 17692 --- [ Thread-1]
总结
JCMD这个命令很强大,我们可以使用它来解决很多棘手的问题,比如:周末你正在喝着啤酒唱着歌,突然收到线上CPU利用率100%的告警,等你好不容器登陆服务器的时候,CPU又降下来了。。。
References
[1]
JCMD命令:
https://jianyuan.me/2018/08/02/jcmd-intro/
近期热文
文章转载自codeImport,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




