I am trying to develop an R package that uses the kmeans function from the RcppMLPACK package.
I include the header section below:
#include <RcppArmadillo.h>
#include <RcppMLPACK.h>
#include <RcppGSL.h>
#include <RcppDist.h>
#include <sstream>
#include <iostream>
#include <fstream>
#include<omp.h>
#include<gsl/gsl_math.h>
#include<gsl/gsl_rng.h>
#include<gsl/gsl_randist.h>
#include<gsl/gsl_sf.h>
// [[Rcpp::depends(RcppProgress)]]
#include <progress.hpp>
#include <progress_bar.hpp>
// [[Rcpp::depends(RcppArmadillo,RcppDist)]]
// [[Rcpp::depends(RcppMLPACK)]]
// [[Rcpp::depends(RcppGSL)]]
// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::plugins(openmp)]]
using namespace mlpack::kmeans ;
using namespace arma;
My Makevars file reads as follows:
CXX_STD = CXX17
GSL_CFLAGS=`${R_HOME}/bin/Rscript -e "RcppGSL:::CFlags()" 4`
GSL_LIBS=`${R_HOME}/bin/Rscript -e "RcppGSL:::LdFlags()"`
RCPP_LDFLAGS=`${R_HOME}/bin/Rscript -e "Rcpp:::LdFlags()"`
PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS) $(GSL_CFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) $(GSL_LIBS) $(RCPP_LDFLAGS)
I am using macOS Ventura. When I try to build my R package, it shows the following error:
In file included from /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library/RcppMLPACK/include/mlpack/core.hpp:171,
from /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library/RcppMLPACK/include/RcppMLPACK.h:4,
from RcppExports.cpp:6:
/Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library/RcppMLPACK/include/mlpack/prereqs.hpp:46:10: fatal error: boost/math/special_functions/gamma.hpp: No such file or directory
>> 46 | #include <boost/math/special_functions/gamma.hpp>
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
However, if I just call Rcpp::sourceCpp on the C++ file, it compiles perfectly. Please help me debug the problem.
P.S. I am using gcc rather than clang. Both boost and mlpack are installed on my system.
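This topic is a little under-documented: mlpack is a big package with a lot in it, but there is no "quick start" for R. At the same time, your question may be overcomplicating things by pulling in several kitchen-sink libraries. I find that adding too much too early makes things confusing.
Here is what I did (using mlpack 3.4.2; see below for mlpack 4.0.1):
I first created a minimal C++ file that includes just two headers and does not do much.
It looks like this, give or take: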
#include <Rcpp/Rcpp>
#include <mlpack.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(mlpack)]]
// [[Rcpp::export]]
void foo() {
    Rcpp::Rcout << "Foo\n";
}
/*** R
foo()
*/
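Compiling this means finding mlpack. I have the CRAN package installed.
This was a little more complicated for me because I happen to work (mostly) on Ubuntu 22.10, which only has the older mlpack 3.4.2 as a convenient system library in the distribution. I think that with a newer mlpack 4.* release I would not need to link. As I often do, I took a simple example from the unit tests. It has the data as well as the call. The complete file now reads: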
#include <Rcpp/Rcpp>
#include <mlpack.h>
// Two include directories adjusted for my use of mlpack 3.4.2 on Ubuntu
#include <mlpack/core.hpp>
#include <mlpack/methods/kmeans/kmeans.hpp>
#include <mlpack/methods/kmeans/random_partition.hpp>
#include <mlpack/methods/neighbor_search/neighbor_search.hpp>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(mlpack)]]
// This is 'borrowed' from mlpack's own src/mlpack/tests/kmeans_test.cpp.
// We borrow the data set and the code from the first test function.
// Passing data from R is easy thanks to RcppArmadillo, and is
// 'left as an exercise'.
// Generate dataset; written transposed because it's easier to read.
arma::mat kMeansData(" 0.0 0.0;" // Class 1.
" 0.3 0.4;"
" 0.1 0.0;"
" 0.1 0.3;"
" -0.2 -0.2;"
" -0.1 0.3;"
" -0.4 0.1;"
" 0.2 -0.1;"
" 0.3 0.0;"
" -0.3 -0.3;"
" 0.1 -0.1;"
" 0.2 -0.3;"
" -0.3 0.2;"
" 10.0 10.0;" // Class 2.
" 10.1 9.9;"
" 9.9 10.0;"
" 10.2 9.7;"
" 10.2 9.8;"
" 9.7 10.3;"
" 9.9 10.1;"
"-10.0 5.0;" // Class 3.
" -9.8 5.1;"
" -9.9 4.9;"
"-10.0 4.9;"
"-10.2 5.2;"
"-10.1 5.1;"
"-10.3 5.3;"
"-10.0 4.8;"
" -9.6 5.0;"
" -9.8 5.1;");
// [[Rcpp::export]]
arma::Row<size_t> kmeansDemo() {
    mlpack::kmeans::KMeans<mlpack::metric::EuclideanDistance,
                           mlpack::kmeans::RandomPartition> kmeans;
    arma::Row<size_t> assignments;
    // mlpack expects one observation per column, hence the transpose.
    kmeans.Cluster((arma::mat) trans(kMeansData), 3, assignments);
    return assignments;
}
/*** R
kmeansDemo()
*/
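Now, because I am on mlpack 3.4.2 I have to link, so I also need to run Sys.setenv("PKG_LIBS"="-lmlpack"), and I had to adjust the headers slightly from the example taken from the repository, which is set up for mlpack 4.1.*. The linking step will vary depending on where you run this.
But with that, my R session produces the result: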
> Sys.setenv("PKG_LIBS"="-lmlpack")
> Rcpp::sourceCpp("~/git/stackoverflow/76319284/answer.cpp")
> kmeansDemo()
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30]
[1,] 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
>
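For anyone wondering what the Cluster() call is doing conceptually, here is a minimal sketch of Lloyd-style k-means iterations in plain C++, without mlpack or Armadillo. This is not mlpack's implementation. The names Point, squaredDistance, lloydKMeans and samplePoints are hypothetical, and the round-robin initial partition is my own deterministic stand-in for a random initial partition. It only uses nine points (three per class) from the kMeansData set above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::array<double, 2>;  // hypothetical 2-D point type

// Squared Euclidean distance between two 2-D points.
double squaredDistance(const Point& a, const Point& b) {
    const double dx = a[0] - b[0];
    const double dy = a[1] - b[1];
    return dx * dx + dy * dy;
}

// Lloyd-style k-means sketch. The initial partition is round-robin (i % k),
// an assumption made here so that the example is deterministic.
std::vector<std::size_t> lloydKMeans(const std::vector<Point>& points,
                                     std::size_t k,
                                     std::size_t maxIterations = 100) {
    assert(k > 0 && points.size() >= k);

    std::vector<std::size_t> labels(points.size());
    for (std::size_t i = 0; i < points.size(); ++i) labels[i] = i % k;

    std::vector<Point> centroids(k, Point{0.0, 0.0});
    for (std::size_t iter = 0; iter < maxIterations; ++iter) {
        // Update step: each centroid becomes the mean of its current members.
        std::vector<Point> sums(k, Point{0.0, 0.0});
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t i = 0; i < points.size(); ++i) {
            sums[labels[i]][0] += points[i][0];
            sums[labels[i]][1] += points[i][1];
            ++counts[labels[i]];
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) continue;  // empty cluster keeps its old centroid
            const double n = static_cast<double>(counts[c]);
            centroids[c] = Point{sums[c][0] / n, sums[c][1] / n};
        }

        // Assignment step: move each point to its nearest centroid.
        bool changed = false;
        for (std::size_t i = 0; i < points.size(); ++i) {
            std::size_t best = 0;
            double bestDistance = std::numeric_limits<double>::max();
            for (std::size_t c = 0; c < k; ++c) {
                const double d = squaredDistance(points[i], centroids[c]);
                if (d < bestDistance) { bestDistance = d; best = c; }
            }
            if (best != labels[i]) { labels[i] = best; changed = true; }
        }
        if (!changed) break;  // converged: no label moved in this pass
    }
    return labels;
}

// Three points from each class of the kMeansData set above.
std::vector<Point> samplePoints() {
    return {Point{0.0, 0.0},   Point{0.3, 0.4},  Point{0.1, 0.0},
            Point{10.0, 10.0}, Point{10.1, 9.9}, Point{9.9, 10.0},
            Point{-10.0, 5.0}, Point{-9.8, 5.1}, Point{-9.9, 4.9}};
}
```

Calling lloydKMeans(samplePoints(), 3) should put the three points of each class into one cluster. Which numeric ID each cluster gets depends on the initial partition, so it should not be relied upon.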
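Update (for mlpack 4.0.1)
With a more current mlpack, things get nicer and easier. With 4.0.1 installed on Ubuntu, the header includes simplify a little, the namespaces change a little, and I added an R package dependency on RcppEnsmallen (which provides the optimization routines). Most importantly, I can build it without linking.
Updated code (for mlpack 4.0.1):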
#include <Rcpp/Rcpp>
#include <mlpack.h>
// Adjusted for mlpack 4.0.1
#include <mlpack/core.hpp>
#include <mlpack/methods/kmeans.hpp>
#include <mlpack/methods/kmeans/random_partition.hpp>
#include <mlpack/methods/neighbor_search/neighbor_search.hpp>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(RcppEnsmallen)]]
// [[Rcpp::depends(mlpack)]]
// [[Rcpp::plugins(cpp14)]]
// This is 'borrowed' from mlpack's own src/mlpack/tests/kmeans_test.cpp.
// We borrow the data set and the code from the first test function.
// Passing data from R is easy thanks to RcppArmadillo, and is
// 'left as an exercise'.
// Generate dataset; written transposed because it's easier to read.
arma::mat kMeansData(" 0.0 0.0;" // Class 1.
" 0.3 0.4;"
" 0.1 0.0;"
" 0.1 0.3;"
" -0.2 -0.2;"
" -0.1 0.3;"
" -0.4 0.1;"
" 0.2 -0.1;"
" 0.3 0.0;"
" -0.3 -0.3;"
" 0.1 -0.1;"
" 0.2 -0.3;"
" -0.3 0.2;"
" 10.0 10.0;" // Class 2.
" 10.1 9.9;"
" 9.9 10.0;"
" 10.2 9.7;"
" 10.2 9.8;"
" 9.7 10.3;"
" 9.9 10.1;"
"-10.0 5.0;" // Class 3.
" -9.8 5.1;"
" -9.9 4.9;"
"-10.0 4.9;"
"-10.2 5.2;"
"-10.1 5.1;"
"-10.3 5.3;"
"-10.0 4.8;"
" -9.6 5.0;"
" -9.8 5.1;");
// [[Rcpp::export]]
arma::Row<size_t> kmeansDemo() {
    mlpack::KMeans<mlpack::EuclideanDistance, mlpack::RandomPartition> kmeans;
    arma::Row<size_t> assignments;
    // mlpack expects one observation per column, hence the transpose.
    kmeans.Cluster((arma::mat) trans(kMeansData), 3, assignments);
    return assignments;
}
/*** R
kmeansDemo()
*/
It builds and runs the same way, of course, and now includes a small amount of default logging:
> Rcpp::sourceCpp("answer.cpp")
> kmeansDemo()
[INFO ] KMeans::Cluster(): iteration 1, residual 14.8221.
[INFO ] KMeans::Cluster(): iteration 2, residual 1.77636e-15.
[INFO ] KMeans::Cluster(): converged after 2 iterations.
[INFO ] 186 distance calculations.
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30]
[1,] 2 2 2 2 2 2 2 2 2 2 2 2 2 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
>
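Note that the cluster IDs here differ from the 3.4.2 run (0 and 1 trade places) while the grouping itself is identical: k-means labels are arbitrary. If you want to compare two runs in code, a check that ignores relabelling may help. Below is a minimal sketch in plain C++ that uses std::vector rather than arma::Row<size_t>. The function name samePartition is hypothetical and not part of mlpack.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Returns true when two label vectors describe the same grouping of points,
// i.e. when a one-to-one relabelling turns `a` into `b`.
bool samePartition(const std::vector<std::size_t>& a,
                   const std::vector<std::size_t>& b) {
    if (a.size() != b.size()) return false;

    std::map<std::size_t, std::size_t> aToB;  // label in a -> label in b
    std::map<std::size_t, std::size_t> bToA;  // label in b -> label in a
    for (std::size_t i = 0; i < a.size(); ++i) {
        // emplace() leaves an existing entry untouched, so the stored value
        // is the label first seen together with this key.
        const auto forward = aToB.emplace(a[i], b[i]);
        const auto backward = bToA.emplace(b[i], a[i]);
        if (forward.first->second != b[i] || backward.first->second != a[i]) {
            return false;  // one label maps to two different partners
        }
    }
    return true;
}
```

Fed the two assignment rows printed in this answer (13 points labelled 2, then 7 labelled 1 and 10 labelled 0 in the first run, versus 7 labelled 0 and 10 labelled 1 in the second), the function returns true.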