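Graph Databases and Graph Computing

Overview

ArangoDB is a mostly in-memory, document-based NoSQL database.

In my experience its performance and design are much better than Neo4j's, and its user interface is much nicer too. It supports both single-instance mode and cluster mode.

Compared with traditional databases and with newer NoSQL stores, a graph database has one distinctive feature: graph computation. That is its strength.

Its weakness is modeling, that is, abstracting real-world data into graph data. This step is actually the harder one.

This post is a starting point that covers some of the basics.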

Deployment

Downloading the source

I could not get the official 3.0 and 3.2 sources to compile, and 3.1.10 has a bug, so I chose 3.1.10.1.

git clone \
  -b 3.1.10.1 \
  https://github.com/arangodb/arangodb.git \
  arangodb-3.1.10.1

Building and installing

Building ArangoDB from source requires gcc and gcc-c++ 4.9 or later, plus python-argparse.
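Before starting a long build it can help to confirm that the compiler meets the 4.9 minimum. The following is a minimal sketch; the function name `version_ge` is my own, and it assumes a `sort` that supports `-V` (GNU coreutils).

```shell
#!/bin/sh
# Sketch: return 0 when version $1 is at least version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Hypothetical usage with the compiler path used in this post:
# if version_ge "$(/usr/local/gcc-4.9.3/bin/gcc -dumpversion)" 4.9; then
#     echo "gcc is new enough"
# fi
```

The comparison sorts the two versions and checks that the required one comes first, which avoids parsing the dotted components by hand.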

yum -y install python-argparse

BASE_DIR=/usr/local/src/arangodb-3.1.10.1/build

rm -rf $BASE_DIR

mkdir -p $BASE_DIR

cd $BASE_DIR

export LD_LIBRARY_PATH=/usr/local/gcc-4.9.3/lib64/

rm -rf /usr/local/arangodb-3.1.10.1

mkdir /usr/local/arangodb-3.1.10.1

ln -sfn /usr/local/arangodb-3.1.10.1 /usr/local/arangodb

/usr/local/cmake-3.5.2/bin/cmake ../ \
  -DCMAKE_INSTALL_PREFIX=/usr/local/arangodb-3.1.10.1 \
  -DPCH=1 \
  -DDEBUG=0 \
  -DCMAKE_C_COMPILER=/usr/local/gcc-4.9.3/bin/gcc \
  -DCMAKE_CXX_COMPILER=/usr/local/gcc-4.9.3/bin/g++

time make -j14

make install

Base environment

By default the ArangoDB cluster expects to be started and stopped as the arangodb user.

groupadd arangodb

useradd arangodb -g arangodb

chown -R arangodb:arangodb /usr/local/arangodb*

mkdir -p /usr/local/arangodb/{data,logs,pids,tmp}

mkdir -p /usr/local/arangodb/tmp/{agency,primary,coordinator}

mkdir -p /usr/local/arangodb/data/{agency,primary,coordinator}

chown -R arangodb:arangodb /usr/local/arangodb*

Cluster mode

Initialization and restart

Stop scripts

The scripts for stopping, checking and starting the cluster are as follows:

cat stop_coordinator.sh

#!/bin/sh
# author  : tanghaiyang
# date    : Fri Jun 23 07:17:22 EDT 2017
# usage   : sh stop_coordinator.sh
NODE=coordinator
# Find the arangod processes for this role and send them SIGTERM
ps -ef | grep arangodb | grep "$NODE" | grep -v grep | grep -v stop | awk '{print $2}' | xargs -t -I {} kill {}
rm -f /usr/local/arangodb/pids/$NODE.pid
# Check whether the process still exists
ps -ef | grep arangodb | grep "$NODE" | grep -v grep | grep -v stop
echo "stop $NODE finished!"

———————————————————————

cat stop_primary.sh

#!/bin/sh
# author  : tanghaiyang
# date    : Fri Jun 23 07:17:22 EDT 2017
# usage   : sh stop_primary.sh
NODE=primary
# Find the arangod processes for this role and send them SIGTERM
ps -ef | grep arangodb | grep "$NODE" | grep -v grep | grep -v stop | awk '{print $2}' | xargs -t -I {} kill {}
rm -f /usr/local/arangodb/pids/$NODE.pid
# Check whether the process still exists
ps -ef | grep arangodb | grep "$NODE" | grep -v grep | grep -v stop
echo "stop $NODE finished!"

———————————————————————

cat stop_agency.sh

#!/bin/sh
# author  : tanghaiyang
# date    : Fri Jun 23 07:17:22 EDT 2017
# usage   : sh stop_agency.sh
NODE=agency
# Find the arangod processes for this role and send them SIGTERM
ps -ef | grep arangodb | grep "$NODE" | grep -v grep | grep -v stop | awk '{print $2}' | xargs -t -I {} kill {}
rm -f /usr/local/arangodb/pids/$NODE.pid
# Check whether the process still exists
ps -ef | grep arangodb | grep "$NODE" | grep -v grep | grep -v stop
echo "stop $NODE finished!"

Check script

ps -ef |grep arango |grep -v su |grep -v bash |grep -v grep |grep -v ps
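The one-liner above prints raw process lines. As a rough sketch, the same check can be turned into a per-role summary; `report_roles` is a hypothetical helper name, and it assumes each role was started with a `roles/<role>.conf` file as in the start scripts in this post.

```shell
#!/bin/sh
# Sketch: read `ps -ef` style lines on stdin and report each role's state.
# A role counts as running when a line mentions arangod and roles/<role>.conf.
report_roles() {
    input=$(cat)
    for role in agency primary coordinator; do
        if printf '%s\n' "$input" | grep arangod | grep -q "roles/$role.conf"; then
            echo "$role: running"
        else
            echo "$role: not running"
        fi
    done
}

# Usage: ps -ef | report_roles
```

Matching on the configuration file path instead of the bare role name avoids false positives from unrelated processes whose command line happens to contain "primary" or "agency".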

Start scripts

cat start_agency.sh

#!/bin/sh
# author  : tanghaiyang
# date    : Fri Jun 23 07:17:22 EDT 2017
# usage   : sh start_agency.sh
NODE=agency
export LD_LIBRARY_PATH=/usr/local/arangodb/lib64/
/usr/local/arangodb/sbin/arangod --configuration /usr/local/arangodb/etc/arangodb3/roles/$NODE.conf
echo "start $NODE finished!"

———————————————————————

cat start_primary.sh

#!/bin/sh
# author  : tanghaiyang
# date    : Fri Jun 23 07:17:22 EDT 2017
# usage   : sh start_primary.sh
NODE=primary
export LD_LIBRARY_PATH=/usr/local/arangodb/lib64/
/usr/local/arangodb/sbin/arangod --configuration /usr/local/arangodb/etc/arangodb3/roles/$NODE.conf
# Show the processes that were started for this role
ps -ef | grep "$NODE" | grep -v grep | grep -v start
echo "start $NODE finished!"

———————————————————————

cat start_coordinator.sh

#!/bin/sh
# author  : tanghaiyang
# date    : Fri Jun 23 07:17:22 EDT 2017
# usage   : sh start_coordinator.sh
NODE=coordinator
export LD_LIBRARY_PATH=/usr/local/arangodb/lib64/
/usr/local/arangodb/sbin/arangod --configuration /usr/local/arangodb/etc/arangodb3/roles/$NODE.conf
echo "start $NODE finished!"

Stopping the ArangoDB cluster

Stop the coordinator nodes first, then the primary nodes, and finally the agency nodes.

stopcoordinator

stopprimary

stopagency

# Check the configuration file
grep -E 'endpoint|my-add' /usr/local/arangodb/etc/arangodb3/roles/primary.conf | grep -v '^#' | grep -v agency

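Cleaning data files for initialization

On first startup, after repeated reinstalls, when the data files are corrupted, or when the cluster is about to be restored from an existing backup, you may want to clear the data files with the following commands: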

rm -rf /usr/local/arangodb/logs/*

rm -rf /usr/local/arangodb/data

rm -rf /usr/local/arangodb/pids/*

rm -rf /usr/local/arangodb/var/lib/arangodb3-apps/*

rm -rf /tmp/arangod_*

mkdir -p /usr/local/arangodb/data/{agency,coordinator,primary}

chown -R arangodb:arangodb /usr/local/arangodb*
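Because these commands run `rm -rf` on fixed paths, a small guard can make them safer to reuse. This is only a sketch under my own assumptions: `reset_data_dirs` is a hypothetical name, it takes the install directory as an argument, and it leaves out the `/tmp/arangod_*` cleanup and the `chown` step shown above.

```shell
#!/bin/sh
# Sketch: clear logs, pid files, Foxx apps and data under an install
# directory, then recreate the three empty per-role data directories.
reset_data_dirs() {
    base=$1
    # Refuse an empty argument, the root directory, or a missing directory.
    if [ -z "$base" ] || [ "$base" = "/" ] || [ ! -d "$base" ]; then
        echo "reset_data_dirs: refusing to clean '$base'" >&2
        return 1
    fi
    rm -rf "$base"/logs/* "$base"/pids/* "$base"/var/lib/arangodb3-apps/* "$base/data"
    mkdir -p "$base/data/agency" "$base/data/coordinator" "$base/data/primary"
}

# Usage: reset_data_dirs /usr/local/arangodb
```

The guard matters mostly when the path comes from a variable: an unset variable would otherwise expand to `/logs/*` and similar paths at the filesystem root.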

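Starting ArangoDB

When starting ArangoDB, start the agency first, then the primary, and finally the coordinator.

The agency plays a role similar to ZooKeeper: the primary registers its cluster information with the agency.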

startagency

startprimary

startcoordinator
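The stop order and the start order are mirror images of each other, so both can be driven by one helper. The sketch below assumes the `start_<role>.sh` and `stop_<role>.sh` scripts shown earlier live in a single directory; `SCRIPT_DIR`, its default value, and the function names are my own assumptions, not part of the original setup.

```shell
#!/bin/sh
# Print the roles in the order this post gives for the requested action.
role_order() {
    case "$1" in
        start) echo "agency primary coordinator" ;;
        stop)  echo "coordinator primary agency" ;;
        *)     echo "usage: role_order start|stop" >&2; return 1 ;;
    esac
}

# Run <action>_<role>.sh for every role, stopping at the first failure.
# SCRIPT_DIR is a hypothetical location for the scripts shown earlier.
run_cluster() {
    action=$1
    script_dir=${SCRIPT_DIR:-/usr/local/arangodb/bin}
    roles=$(role_order "$action") || return 1
    for role in $roles; do
        sh "$script_dir/${action}_${role}.sh" || return 1
    done
}

# Usage: run_cluster stop && run_cluster start
```

Keeping the order in one place means a restart cannot accidentally bring up a coordinator before the agency it depends on.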

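Configuration files

Agency node

The agency configuration file in version 3.0.12 has an id option, but the option no longer exists from 3.1.10 onward and must not appear in a 3.1.10 configuration.

A sample configuration follows. It must include the [agency] section, otherwise the primary and coordinator nodes cannot start. I hit this pitfall on 2017-08-05 and only found the cause the next day after checking the configuration carefully: the section had been dropped during copy and paste.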

vim /usr/local/arangodb/etc/arangodb3/roles/agency.conf

##############################

supervisor = true

pid-file = /usr/local/arangodb/pids/agency.pid

[database]

directory = /usr/local/arangodb/data/agency

[server]

endpoint = tcp://dm0:5020

authentication = false

statistics = true

uid = arangodb

jwt-secret = 123456

[scheduler]

[javascript]

startup-directory = /usr/local/arangodb/share/arangodb3/js

app-path = /usr/local/arangodb/var/lib/arangodb3-apps

[foxx]

[log]

level = info

file = /usr/local/arangodb/logs/agency.log

use-local-time = true

[temp]

path = /usr/local/arangodb/tmp/agency

[agency]

activate = true

size = 1

supervision = true
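Since a missing [agency] section fails quietly until the other roles refuse to start, a quick check before startup may save some time. This is a minimal sketch; `has_section` is a hypothetical helper that only looks for the section header and does not validate the options inside it.

```shell
#!/bin/sh
# Sketch: return 0 when the INI-style file $1 has a [section] header named $2.
has_section() {
    grep -q "^[[:space:]]*\[$2\][[:space:]]*\$" "$1"
}

# Hypothetical usage before starting the agency:
# has_section /usr/local/arangodb/etc/arangodb3/roles/agency.conf agency \
#     || echo "agency.conf has no [agency] section" >&2
```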

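Primary node

Apart from the following three options in the [server] and [cluster] sections, the primary configuration file is the same on every machine: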

[server]

endpoint = tcp://dm0:18558

……

[cluster]

my-address = tcp://dm0:18558

my-local-info = primary_dm0

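Note:

Do not add my-id. If my-id is set, the coordinator node cannot start, and the debug log reports that the endpoint is invalid.

In addition, the agency setting is the same on every machine: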

agency-endpoint = tcp://dm0:5020
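With several primaries, the three per-host values are easy to get wrong by hand. The sketch below prints them for a given host and port and checks a file for a stray my-id. Both function names are hypothetical, and the `primary_<host>` naming simply follows the `primary_dm0` example above.

```shell
#!/bin/sh
# Sketch: print the per-host lines for a primary on <host>:<port>.
primary_overrides() {
    host=$1
    port=$2
    printf '[server]\nendpoint = tcp://%s:%s\n\n' "$host" "$port"
    printf '[cluster]\nmy-address = tcp://%s:%s\nmy-local-info = primary_%s\n' \
        "$host" "$port" "$host"
}

# Return 0 when the config file $1 sets my-id, which the note above
# says prevents the coordinator from starting.
sets_my_id() {
    grep -Eq '^[[:space:]]*my-id[[:space:]]*=' "$1"
}

# Usage: primary_overrides dm1 18558
```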

A sample configuration follows:

/usr/local/arangodb/etc/arangodb3/roles/primary.conf

##############################

supervisor = true

pid-file = /usr/local/arangodb/pids/primary.pid

[database]

directory = /usr/local/arangodb/data/primary

[server]

endpoint = tcp://dm0:18558

authentication = false

statistics = true

uid = arangodb

jwt-secret = 123456

[scheduler]

[javascript]

startup-directory = /usr/local/arangodb/share/arangodb3/js

app-path = /usr/local/arangodb/var/lib/arangodb3-apps

[foxx]

[log]

level = info

file = /usr/local/arangodb/logs/primary.log

[temp]

path = /usr/local/arangodb/tmp/primary

[cluster]

my-role = PRIMARY

my-address = tcp://dm0:18558

my-local-info = primary_dm0

agency-endpoint = tcp://dm0:5020

Coordinator node

There can also be more than one coordinator node.

A sample configuration follows:

/usr/local/arangodb/etc/arangodb3/roles/coordinator.conf

##############################

supervisor = true

pid-file = /usr/local/arangodb/pids/coordinator.pid

[database]

directory = /usr/local/arangodb/data/coordinator

[server]

endpoint = tcp://dm0:18530

authentication = false

statistics = true

uid = arangodb

jwt-secret = 123456

[scheduler]

[javascript]

startup-directory = /usr/local/arangodb/share/arangodb3/js

app-path = /usr/local/arangodb/var/lib/arangodb3-apps

v8-contexts = 10

[foxx]

[log]

level = info

file = /usr/local/arangodb/logs/coordinator.log

[temp]

path = /usr/local/arangodb/tmp/coordinator

[cluster]

my-role = COORDINATOR

my-address = tcp://dm0:18530

my-local-info = coord_171

agency-endpoint = tcp://dm0:5020

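Single-instance mode

In single-instance mode the web interface has no NODE entry in the left-hand menu, and the server runs only one arangod process.

The start script is also simple: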

vim /usr/local/arangodb/bin/start_single.sh

#!/bin/sh
export LD_LIBRARY_PATH=/usr/local/arangodb/lib64/
/usr/local/arangodb/sbin/arangod \
  --database.directory /usr/local/arangodb/data/primary \
  --javascript.startup-directory /usr/local/arangodb/share/arangodb3/js \
  --javascript.app-path /usr/local/arangodb/var/lib/arangodb3-apps \
  --server.endpoint tcp://dm9:18530 \
  --server.authentication false

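Note:

Cluster mode can also run with a single primary node as the only data node, but it is still cluster mode, and creating shards, databases or collections is sometimes restricted. For example, on a cluster with a single primary node the command below cannot create a collection with a large replicationFactor, because the collection is sharded.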

curl -X POST --data-binary @- --dump - "http://${ARANGO_HOST}:${ARANGO_PORT}/_db/${ARANGO_DB}/_api/collection" <<EOF
{
  "type" : "${FILE_EXT}",
  "name" : "${FILE_NAME}",
  "numberOfShards": 12,
  "replicationFactor": 1,
  "waitForSync": true
}
EOF
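To try different shard and replication settings without editing the heredoc each time, the request body can be generated by a small function. This is a sketch, not the original script: `collection_payload` is a hypothetical name, and it assumes the caller passes the collection type as the numeric value used by the HTTP API (2 for a document collection, 3 for an edge collection).

```shell
#!/bin/sh
# Sketch: build the JSON body for POST /_api/collection.
# Arguments: name, type (2 = document, 3 = edge), shards, replication factor.
collection_payload() {
    name=$1
    type=$2
    shards=$3
    replication=$4
    cat <<EOF
{
  "name": "$name",
  "type": $type,
  "numberOfShards": $shards,
  "replicationFactor": $replication,
  "waitForSync": true
}
EOF
}

# Hypothetical usage with the variables from the command above:
# collection_payload "$FILE_NAME" 2 12 1 | curl -X POST --data-binary @- --dump - \
#     "http://${ARANGO_HOST}:${ARANGO_PORT}/_db/${ARANGO_DB}/_api/collection"
```

On a cluster with a single primary, keeping the replication factor at 1 as in the usage line matches the restriction described in the note above.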
