
https://github.com/Kyligence/ClickHouse/blob/clickhouse_backend/cmake/cpu_features.cmake
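Finally, the build tools shipped with the operating system may be too old and block the ClickHouse build, which requires, for example, Clang 16.0+, CMake 3.20+, and ninja-build 1.8.2+. Clang is part of the LLVM project, and building LLVM from source in turn depends on GCC 7.3+, Python 3+, and so on.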
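The minimum versions listed above can be compared against what is installed with a small helper. This is a minimal sketch, assuming GNU `sort -V` is available; the function names `version_ge` and `check_tool` are hypothetical, and the sample versions are the ones reported for the environment used in this article.

```shell
#!/usr/bin/env bash
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort), which is an assumption about the host.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Prints OK or TOO OLD for one tool, given its installed and required versions.
check_tool() {
    local name="$1" have="$2" need="$3"
    if version_ge "$have" "$need"; then
        echo "$name $have: OK (needs >= $need)"
    else
        echo "$name $have: TOO OLD (needs >= $need)"
    fi
}

# Sample values: installed versions as reported in this article, minimums from the text above
check_tool cmake 3.22.0 3.20
check_tool ninja-build 1.10.2 1.8.2
check_tool gcc 10.3.1 7.3
```

In practice the installed version would be taken from each tool's `--version` output rather than hard-coded.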
[root@FelixZh]# cat /etc/os-release
NAME="openEuler"
VERSION="22.03 LTS"
ID="openEuler"
VERSION_ID="22.03"
PRETTY_NAME="openEuler 22.03 LTS"
ANSI_COLOR="0;31"
yum install cmake ninja-build yasm nasm gcc gcc-c++ ccache
[root@FelixZh mySourceCode]# cmake --version
cmake version 3.22.0
[root@FelixZh mySourceCode]# g++ --version
g++ (GCC) 10.3.1
[root@FelixZh mySourceCode]# ninja-build --version
1.10.2
# cmake
wget -O CMake-3.22.3.tar.gz https://github.com/Kitware/CMake/archive/refs/tags/v3.22.3.tar.gz
tar -xvf CMake-3.22.3.tar.gz
cd CMake-3.22.3/
./configure --prefix=/usr/local/cmake-3.22.3
make -j8
make install
# gcc
wget https://ftp.gnu.org/gnu/gcc/gcc-11.5.0/gcc-11.5.0.tar.gz
tar -zxvf ./gcc-11.5.0.tar.gz
cd gcc-11.5.0

If the network is unreachable, the dependency archives can be downloaded manually.
https://gcc.gnu.org/pub/gcc/infrastructure/
# The exact version numbers can be found in the script:
cat contrib/download_prerequisites
gmp='gmp-6.1.0.tar.bz2'
mpfr='mpfr-3.1.6.tar.bz2'
mpc='mpc-1.0.3.tar.gz'
isl='isl-0.18.tar.bz2'
Once downloaded, upload the archives to gcc-11.5.0/ and run ./contrib/download_prerequisites
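Before running the script offline, it can help to confirm that all four archives actually landed in the source directory. The following is a minimal sketch: the archive names are the ones listed above, while the `missing_prereqs` function and the `SRC_DIR` variable are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Archive names as listed in contrib/download_prerequisites for gcc-11.5.0 (taken from this article)
PREREQS="gmp-6.1.0.tar.bz2 mpfr-3.1.6.tar.bz2 mpc-1.0.3.tar.gz isl-0.18.tar.bz2"

# Prints, one per line, the archives that are not present in directory $1.
missing_prereqs() {
    local dir="$1" f
    for f in $PREREQS; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# SRC_DIR is a hypothetical variable; it defaults to the current directory (e.g. gcc-11.5.0/)
SRC_DIR="${SRC_DIR:-.}"
missing="$(missing_prereqs "$SRC_DIR")"
if [ -z "$missing" ]; then
    echo "all prerequisite archives present in $SRC_DIR"
else
    echo "missing archives in $SRC_DIR:"
    echo "$missing"
fi
```

If nothing is reported missing, ./contrib/download_prerequisites should find the archives locally and only unpack and link them.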
yum install -y lbzip2 gcc gcc-c++ gmp-devel mpfr-devel libmpc-devel isl-devel
./configure --prefix=/usr/local/gcc-11.5.0 --enable-languages=c,c++ --disable-multilib
make -j8 && make install

vim /etc/profile
export GCC_HOME=/usr/local/gcc-11.5.0
export PATH=${GCC_HOME}/bin:$PATH
Building LLVM
This article uses LLVM 19, as follows:
wget https://github.com/llvm/llvm-project/releases/download/llvmorg-19.1.4/llvm-project-19.1.4.src.tar.xz
tar -xvf llvm-project-19.1.4.src.tar.xz
cd llvm-project-19.1.4.src
mkdir build
cmake -S llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local/llvm19 -DLLVM_ENABLE_PROJECTS='bolt;clang;clang-tools-extra;compiler-rt;lld;lldb;cross-project-tests;libclc;polly' -DLLVM_ENABLE_RUNTIMES=all
cd build/ && ninja -j8
ninja install
export PATH=/usr/local/llvm19/bin:$PATH
export CC=clang-19
export CXX=clang++
The result can be verified as follows:
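A minimal sketch of such a check, assuming the PATH export above is in effect. It prints where each tool resolves and the first line of its version output. The `report_tool` function is a hypothetical helper, and the tool list is an assumption based on the projects enabled in the build.

```shell
#!/usr/bin/env bash
# Prints the resolved path and first version line of a tool,
# or a note if the tool is not on PATH.
report_tool() {
    local tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool -> $(command -v "$tool")"
        "$tool" --version | head -n1
    else
        echo "$tool -> not found on PATH"
    fi
}

# Tools expected under /usr/local/llvm19/bin after "ninja install" (assumed list)
for t in clang-19 clang++ ld.lld; do
    report_tool "$t"
done
```

If the paths printed do not point into /usr/local/llvm19/bin, an older system compiler is probably still ahead on PATH.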

Building Gluten
git clone -b v1.3.0 https://github.com/apache/incubator-gluten.git
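With ClickHouse as the backend, the build can be run through build_clickhouse.sh. The script automatically downloads ClickHouse (CK) at the specified commit ID from the Kyligence repository; the details are recorded in the clickhouse.version file: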
[root@FelixZh incubator-gluten]# bash ./ep/build-clickhouse/src/build_clickhouse.sh
/home/mySourceCode/incubator-gluten
CH_ORG=Kyligence
CH_BRANCH=rebase_ch/20250107
CH_COMMIT=01d2a08fb01
-- The C compiler identification is Clang 19.1.4
-- The CXX compiler identification is Clang 19.1.4
-- The ASM compiler identification is Clang with GNU-like command-line
-- Found assembler: /usr/local/llvm19/bin/clang-19
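The CH_ORG, CH_BRANCH and CH_COMMIT values printed by the script can also be inspected before starting a long build. The sketch below reads them from a KEY=VALUE file. The path `cpp-ch/clickhouse.version` is an assumption about where the file sits in the Gluten checkout, and `read_version_key` and `VERSION_FILE` are hypothetical names.

```shell
#!/usr/bin/env bash
# Prints the value of key $1 from a KEY=VALUE file $2 (empty output if the key is absent).
read_version_key() {
    local key="$1" file="$2"
    sed -n "s/^${key}=//p" "$file" | head -n1
}

# Assumed location of the version file, relative to the Gluten source root
VERSION_FILE="${VERSION_FILE:-cpp-ch/clickhouse.version}"
if [ -f "$VERSION_FILE" ]; then
    for k in CH_ORG CH_BRANCH CH_COMMIT; do
        echo "$k=$(read_version_key "$k" "$VERSION_FILE")"
    done
else
    echo "version file not found: $VERSION_FILE"
fi
```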

libch.so is generated at the following path:
incubator-gluten/cpp-ch/build/utils/extern-local-engine/
Then build the Java part of the code with mvn, as follows:
mvn clean install -Pbackends-clickhouse -Phadoop-3.2 -Pspark-3.3 -Dhadoop.version=3.2.3 -DskipTests -Dcheckstyle.skip -Pdelta

The generated JAR is located at:
backends-clickhouse/target/gluten-1.3.0-spark-3.3-jar-with-dependencies.jar
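The Spark configuration in the next section refers to the native library as /opt/libch-1.3.0.so, so the build outputs need to be copied into place first. This is a minimal sketch under stated assumptions: the source root /home/mySourceCode/incubator-gluten is taken from the build log above, the `SPARK_HOME` default is hypothetical, placing the JAR under `$SPARK_HOME/jars/` is one common option rather than something the article specifies, and `stage_artifact` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Copies file $1 to destination $2 and reports the result.
# Returns non-zero if the source file does not exist or the copy fails.
stage_artifact() {
    local src="$1" dst="$2"
    if [ ! -f "$src" ]; then
        echo "missing: $src"
        return 1
    fi
    cp "$src" "$dst" && echo "staged: $src -> $dst"
}

# Assumed locations; adjust to the actual checkout and Spark installation
GLUTEN_SRC="${GLUTEN_SRC:-/home/mySourceCode/incubator-gluten}"
SPARK_HOME="${SPARK_HOME:-/opt/spark}"   # hypothetical default

# Native library, renamed to the path used in spark-env.sh and spark-defaults.conf
stage_artifact "$GLUTEN_SRC/cpp-ch/build/utils/extern-local-engine/libch.so" /opt/libch-1.3.0.so || true
# Gluten JAR, copied next to the other Spark JARs (assumed deployment choice)
stage_artifact "$GLUTEN_SRC/backends-clickhouse/target/gluten-1.3.0-spark-3.3-jar-with-dependencies.jar" "$SPARK_HOME/jars/" || true
```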
Verification
Configure spark-env.sh
export LD_PRELOAD="/opt/libch-1.3.0.so"
Configure spark-defaults.conf
spark.sql.adaptive.enabled false
spark.shuffle.manager org.apache.spark.shuffle.sort.ColumnarShuffleManager
spark.sql.orc.impl native
spark.plugins org.apache.gluten.GlutenPlugin
spark.memory.offHeap.enabled true
spark.memory.offHeap.size 4G
spark.executorEnv.LD_PRELOAD /opt/libch-1.3.0.so
spark.gluten.sql.columnar.libpath /opt/libch-1.3.0.so
spark.gluten.sql.enable.native.validation false
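A wrong library path in this file is an easy mistake to make, so a quick check before launching Spark may save time. The sketch below reads a property from a whitespace-separated properties file, as used above, and tests whether the configured library exists. `conf_value`, `check_libpath`, `CONF` and the `/opt/spark` default are hypothetical names and values; only the property key comes from the article.

```shell
#!/usr/bin/env bash
# Prints the value of property $1 from a properties file $2.
# Only handles the "key value" (whitespace-separated) form used in this article.
conf_value() {
    awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# Reports whether the native library named in properties file $1 exists on disk.
check_libpath() {
    local conf="$1" lib
    lib="$(conf_value spark.gluten.sql.columnar.libpath "$conf")"
    if [ -z "$lib" ]; then
        echo "libpath not set in $conf"
    elif [ -f "$lib" ]; then
        echo "libpath ok: $lib"
    else
        echo "libpath missing on disk: $lib"
    fi
}

# Hypothetical default location of the Spark configuration file
CONF="${CONF:-${SPARK_HOME:-/opt/spark}/conf/spark-defaults.conf}"
if [ -f "$CONF" ]; then
    check_libpath "$CONF"
else
    echo "config file not found: $CONF"
fi
```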
Run a test SQL statement through spark-sql:
select * from test_orc;

This article is reposted from 大数据从业者.




