HugeGraph Basic Operations

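HugeGraph is a graph database open-sourced by Baidu. Its insert performance is much better than Neo4j's, and it performs very well on large-scale data processing.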

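1. Fundamentals

1.1 Basic concepts

1. Graph: a graph of relationships, for example a graph of classmates and friends, or a graph of bank transfers.
2. Vertex: usually an entity, for example a person or an account.
3. Edge: usually a relationship between vertices, for example a friendship or a transfer.
4. Property: vertices and edges can carry properties, for example a person's name, a person's age, or the time of a transfer.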
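
The four concepts above can be sketched as a tiny in-memory model. This is only an illustration in plain Java; the class name TinyGraph and its methods are hypothetical and are not part of HugeGraph.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory property graph; not HugeGraph code.
class TinyGraph {
    // A vertex is an entity with an ID and a bag of properties.
    static class Vertex {
        final String id;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Vertex(String id) {
            this.id = id;
        }
    }

    // An edge is a labelled, directed relationship that can also carry properties.
    static class Edge {
        final String label;
        final Vertex source;
        final Vertex target;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Edge(String label, Vertex source, Vertex target) {
            this.label = label;
            this.source = source;
            this.target = target;
        }
    }

    // The graph is the collection of all vertices and edges.
    final List<Vertex> vertices = new ArrayList<>();
    final List<Edge> edges = new ArrayList<>();

    Vertex addVertex(String id) {
        Vertex vertex = new Vertex(id);
        vertices.add(vertex);
        return vertex;
    }

    Edge addEdge(String label, Vertex source, Vertex target) {
        Edge edge = new Edge(label, source, target);
        edges.add(edge);
        return edge;
    }
}
```

For instance, two person vertices joined by a "friend" edge with a "since" property would be one graph, two vertices, one edge and a handful of properties.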

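1.2 Metadata

HugeGraph defines four basic kinds of metadata:

PropertyKey: the type of a property
VertexLabel: the type of a vertex
EdgeLabel: the type of an edge
IndexLabel: the type of an index

# How they relate to each other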
    IndexLabel 
      /      \
     /        \
    /       EdgeLabel
   /           /
VertexLabel   /
      /      /
     /      /
   PropertyKey
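
One practical consequence of the diagram is the order in which schema elements have to be created: whatever a definition refers to must exist first. The sketch below derives that order with a small depth-first traversal. The dependency table is an assumption read off the diagram, plus the fact that an EdgeLabel names its source and target VertexLabel; the class SchemaOrder is hypothetical and not part of HugeGraph.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: computes a creation order for the four metadata kinds.
class SchemaOrder {
    // Each metadata kind mapped to the kinds its definition refers to (assumed).
    static Map<String, List<String>> dependencies() {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("IndexLabel", List.of("VertexLabel", "EdgeLabel"));
        deps.put("EdgeLabel", List.of("VertexLabel", "PropertyKey"));
        deps.put("VertexLabel", List.of("PropertyKey"));
        deps.put("PropertyKey", List.of());
        return deps;
    }

    // Returns the kinds so that every kind appears after the kinds it depends on.
    static List<String> creationOrder() {
        Map<String, List<String>> deps = dependencies();
        Set<String> done = new LinkedHashSet<>();
        for (String kind : deps.keySet()) {
            visit(kind, deps, done);
        }
        return new ArrayList<>(done);
    }

    // Depth-first visit: emit dependencies before the kind itself.
    private static void visit(String kind, Map<String, List<String>> deps, Set<String> done) {
        if (done.contains(kind)) {
            return;
        }
        for (String dependency : deps.get(kind)) {
            visit(dependency, deps, done);
        }
        done.add(kind);
    }
}
```

Under these assumptions the order is PropertyKey, VertexLabel, EdgeLabel, IndexLabel, which is the reverse of the order in which the subsections below present them.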

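1.2.1 IndexLabel, the index type

In HugeGraph you can add indexes on vertices and edges to speed up queries. Three kinds of index are available:

search(): full-text index;
secondary(): secondary index, which quickly matches vertices and edges by property value;
range(): range index, which quickly matches vertices and edges by a range of property values.

// Create index label "personByName": look up "person" vertices quickly by the value of the "name" property
graph.schema().indexLabel("personByName")
	.onV("person")
	.by("name")
	.secondary()
	.create();
// Create index label "personByAge": look up "person" vertices quickly by a range of the "age" property
graph.schema().indexLabel("personByAge")
	.onV("person")
	.by("age")
	.range()
	.create();
// Create index label "knowsByWeight": look up "knows" edges quickly by a range of the "weight" property
graph.schema().indexLabel("knowsByWeight")
	.onE("knows")
	.by("weight")
	.range()
	.ifNotExist()
	.create();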
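
To make the difference between a secondary index and a range index concrete, here is a minimal plain-Java sketch. It assumes a secondary index behaves like an exact-match map and a range index like a sorted map; this is a conceptual model, not how HugeGraph stores its indexes, and the class IndexSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual model of two index kinds; not HugeGraph's implementation.
class IndexSketch {
    // Secondary index: exact property value -> vertex IDs.
    final Map<String, List<String>> personByName = new HashMap<>();
    // Range index: property values kept sorted -> vertex IDs.
    final TreeMap<Integer, List<String>> personByAge = new TreeMap<>();

    void addPerson(String id, String name, int age) {
        personByName.computeIfAbsent(name, k -> new ArrayList<>()).add(id);
        personByAge.computeIfAbsent(age, k -> new ArrayList<>()).add(id);
    }

    // Exact-match lookup, the kind of query a secondary index serves.
    List<String> findByName(String name) {
        return personByName.getOrDefault(name, List.of());
    }

    // IDs of persons with minAge <= age <= maxAge, in ascending age order.
    List<String> findByAgeBetween(int minAge, int maxAge) {
        List<String> result = new ArrayList<>();
        for (List<String> ids : personByAge.subMap(minAge, true, maxAge, true).values()) {
            result.addAll(ids);
        }
        return result;
    }
}
```

The sorted map is what makes the range query cheap: it can jump to the lower bound and walk forward, instead of scanning every person.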

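1.2.2 EdgeLabel, the edge type

An EdgeLabel in HugeGraph defines, among other things, its name, sourceLabel, targetLabel, and properties.

// Create edge label "knows" (person knows person): edges of this type go from "person" to "person"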
graph.schema().edgeLabel("knows")
	.sourceLabel("person")
	.targetLabel("person")
	.properties("weight")
	.create()
// Create edge label "created" (person created software): edges of this type go from "person" to "software"
graph.schema().edgeLabel("created")
	.sourceLabel("person")
	.targetLabel("software")
	.properties("weight")
	.create()
// Create edge label "contains" (software contains software): edges of this type go from "software" to "software"
graph.schema().edgeLabel("contains")
	.sourceLabel("software")
	.targetLabel("software")
	.properties("weight")
	.create()
// Create edge label "define" (software defines language): edges of this type go from "software" to "language"
graph.schema().edgeLabel("define")
	.sourceLabel("software")
	.targetLabel("language")
	.properties("weight")
	.create()
// Create edge label "implements" (software implements software): edges of this type go from "software" to "software"
graph.schema().edgeLabel("implements")
	.sourceLabel("software")
	.targetLabel("software")
	.properties("weight")
	.create()
// Create edge label "supports" (software supports language): edges of this type go from "software" to "language"
graph.schema().edgeLabel("supports")
	.sourceLabel("software")
	.targetLabel("language")
	.properties("weight")
	.create()
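
The sourceLabel and targetLabel of an edge label act as a constraint on which vertices an edge may connect. A minimal sketch of that check in plain Java follows; the class EdgeLabelSketch and its accepts method are hypothetical names used for illustration and are not HugeGraph APIs.

```java
// Illustrative model of an edge label's direction constraint; not a HugeGraph class.
class EdgeLabelSketch {
    final String name;
    final String sourceLabel;
    final String targetLabel;

    EdgeLabelSketch(String name, String sourceLabel, String targetLabel) {
        this.name = name;
        this.sourceLabel = sourceLabel;
        this.targetLabel = targetLabel;
    }

    // True when an edge from a vertex labelled sourceVertexLabel to a vertex
    // labelled targetVertexLabel matches the declared source and target labels.
    boolean accepts(String sourceVertexLabel, String targetVertexLabel) {
        return sourceLabel.equals(sourceVertexLabel)
                && targetLabel.equals(targetVertexLabel);
    }
}
```

With the "created" label defined above, a person-to-software edge would pass this check, while a software-to-person edge would not, because the direction is part of the definition.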

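1.2.3 VertexLabel, the vertex type

A VertexLabel is the type of a vertex. Every vertex has a VertexLabel, and one VertexLabel can have many vertices. A VertexLabel defines the type name, the properties, the ID strategy, and whether a label index is created for that class of vertices. For example:

// Create vertex label "person" with name, age, address and other properties, using a custom string as the ID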
graph.schema()
    .vertexLabel("person")
    .properties("name", "age", "addr", "weight")
    .nullableKeys("addr", "weight")
    .useCustomizeStringId()
    .create()
// Create vertex label "software" with name, language, tag and other properties, using the name as the primary key
graph.schema()
    .vertexLabel("software")
    .properties("name", "lang", "tag", "weight")
    .primaryKeys("name")
    .create()
// Create vertex label "language" with name, language and other properties, using the name as the primary key
graph.schema()
    .vertexLabel("language")
    .properties("name", "lang", "weight")
    .primaryKeys("name")
    .create()

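Taking VertexLabel person as an example:

vertexLabel("person") sets the name of the vertex label to person;
properties("name", "age", "addr", "weight") means vertices of type person carry properties of the PropertyKeys name, age, addr and weight;
nullableKeys("addr", "weight") means vertices of type person may omit the properties addr and weight;
useCustomizeStringId() means vertices of type person use a caller-supplied String as their ID;
enableLabelIndex(true) is included by default, so person vertices can be looked up by label.

Full list of methods available when defining a VertexLabel:

Name, a string: vertexLabel(String);
Properties: properties(String...), each of which must be the name of a PropertyKey that already exists in the system;
Nullable properties: nullableKeys(String...), which must be a subset of properties.

ID strategy

useAutomaticId(): automatic ID strategy; the system assigns a numeric ID to each vertex of this type when it is created;
usePrimaryKeyId(): primary-key ID strategy; the vertex ID is built by concatenating the values of the properties listed in primaryKeys(String...);
useCustomizeStringId(): custom String ID strategy; the caller supplies a String to use as the vertex ID at creation time;
useCustomizeNumberId(): custom Number ID strategy; the caller supplies a Number to use as the vertex ID at creation time.

Label index: enableLabelIndex(Boolean) controls whether a label index is created. With a label index, vertices can be queried efficiently by label.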
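
The four ID strategies differ mainly in who supplies the ID and what type it has. The sketch below models that in plain Java. It is an illustration under stated assumptions: the counter used for automatic IDs and the "!" separator used to join primary-key values are placeholders, since this document does not describe the real ID layout, and the class IdStrategySketch is hypothetical.

```java
import java.util.List;

// Conceptual model of the four ID strategies; not HugeGraph's implementation.
class IdStrategySketch {
    enum IdStrategy { AUTOMATIC, PRIMARY_KEY, CUSTOMIZE_STRING, CUSTOMIZE_NUMBER }

    // Hypothetical counter standing in for system-assigned numeric IDs.
    private long nextAutomaticId = 1;

    Object assignId(IdStrategy strategy, Object customId, List<String> primaryKeyValues) {
        switch (strategy) {
            case AUTOMATIC:
                // The system picks a number; the caller supplies nothing.
                return nextAutomaticId++;
            case PRIMARY_KEY:
                // The ID is derived from property values; "!" is a placeholder separator.
                return String.join("!", primaryKeyValues);
            case CUSTOMIZE_STRING:
                if (!(customId instanceof String)) {
                    throw new IllegalArgumentException("a String ID is required");
                }
                return customId;
            case CUSTOMIZE_NUMBER:
                if (!(customId instanceof Number)) {
                    throw new IllegalArgumentException("a Number ID is required");
                }
                return customId;
            default:
                throw new IllegalStateException("unknown strategy: " + strategy);
        }
    }
}
```

This mirrors the examples above: person vertices pass their own string ID, while software and language vertices get an ID derived from the name property.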

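1.2.4 PropertyKey, the property type

A PropertyKey defines the type of a property, including its name, data type, and cardinality. For example:

// Create the name property, text type
graph.schema().propertyKey("name").asText().create()
// Create the age property, integer type
graph.schema().propertyKey("age").asInt().create()
// Create the address property, text type
graph.schema().propertyKey("addr").asText().create()
// Create the language property, text type
graph.schema().propertyKey("lang").asText().create()
// Create the tag property, text type
graph.schema().propertyKey("tag").asText().create()
// Create the weight property, float type
graph.schema().propertyKey("weight").asFloat().create()

Taking PropertyKey name as an example:

propertyKey("name") sets the property name to "name"
asText() sets the data type to text
valueSingle() sets the cardinality to single, i.e. one value per property; it is the default, so the example above omits it

Full list of methods available when defining a PropertyKey:

Name, a string: propertyKey(String)
Data types:
asText(): string, the default
asInt(): integer
asDate(): date
asUuid(): UUID
asBoolean(): boolean
asByte(): byte
asBlob(): byte array
asDouble(): double-precision floating point
asFloat(): single-precision floating point
asLong(): long integer
Cardinalities:
valueSingle(): a single value, the default
valueList(): a list of values
valueSet(): a set of values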
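
The three cardinalities decide what a property holds after several values have been written to it. A minimal plain-Java sketch follows. It assumes that single keeps only the most recent value, list keeps every value in order including duplicates, and set drops duplicates; the class CardinalitySketch is hypothetical and not part of HugeGraph.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

// Conceptual model of property cardinality; not HugeGraph's implementation.
class CardinalitySketch {
    enum Cardinality { SINGLE, LIST, SET }

    // Returns what the property would hold after the given values were written in order.
    static Collection<Object> stored(Cardinality cardinality, List<Object> written) {
        switch (cardinality) {
            case SINGLE: {
                // Assumption: a later write replaces the earlier value.
                List<Object> last = new ArrayList<>();
                if (!written.isEmpty()) {
                    last.add(written.get(written.size() - 1));
                }
                return last;
            }
            case LIST:
                // Keeps order and duplicates.
                return new ArrayList<>(written);
            case SET:
                // Keeps first-seen order but removes duplicates.
                return new LinkedHashSet<>(written);
            default:
                throw new IllegalStateException("unknown cardinality: " + cardinality);
        }
    }
}
```

Writing "java", "groovy", "java" to a property would therefore leave one value under single, three under list, and two under set.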

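2. Graph data

Graph data in HugeGraph consists of:

Vertices and their properties
Edges and their properties
Index data

A property cannot exist on its own; it must be attached to a vertex or an edge. Index data is not visible to users; it is internal system data used to speed up queries by property.

2.1 Vertex

A Vertex usually corresponds to a real-world entity, such as a person or a book.

// Add 3 author vertices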
javeme = graph.addVertex(T.label, "person", T.id, "javeme", "name", "Jermy Li", "age", 29, "addr", "Beijing", "weight", 1)
zhoney = graph.addVertex(T.label, "person", T.id, "zhoney", "name", "Zhoney Zhang", "age", 29, "addr", "Beijing", "weight", 1)
linary = graph.addVertex(T.label, "person", T.id, "linary", "name", "Linary Li", "age", 28, "addr", "Wuhan. Hubei", "weight", 1)

// Add the HugeGraph vertex
hugegraph = graph.addVertex(T.label, "software", "name", "HugeGraph", "lang", "java", "tag", "Graph Database", "weight", 1)

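Taking vertex javeme as an example:

The label is VertexLabel person
The ID is javeme; the ID strategy of VertexLabel person is useCustomizeStringId(), so the given string "javeme" can be used as the vertex ID
The properties and their values are {"name": "Jermy Li", "age": 29, "addr": "Beijing", "weight": 1}

A vertex consists of three parts:

Label: the type of the vertex, i.e. some VertexLabel
ID: the unique identifier of the vertex; its type is determined by the ID strategy of the vertex's VertexLabel (see the VertexLabel section)
Properties: the vertex's properties; which properties exist, and which may be empty, is constrained by the VertexLabel; the property values are constrained by the PropertyKeys

Because a vertex depends on its VertexLabel, the VertexLabel must already exist before the vertex is created.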
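
The way a VertexLabel constrains a vertex's properties can be sketched as a small validation routine: every property must be declared by the label, and every declared property that is not nullable must be present. This is a plain-Java illustration of the rule as described here, not HugeGraph's own validation code; the class VertexCheck and its error messages are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual check of a vertex's properties against its label; not HugeGraph code.
class VertexCheck {
    // Returns a list of problems; an empty list means the vertex fits the label.
    static List<String> validate(Set<String> labelProperties,
                                 Set<String> nullableKeys,
                                 Map<String, Object> vertexProperties) {
        List<String> problems = new ArrayList<>();
        // Every property on the vertex must be declared by the label.
        for (String key : vertexProperties.keySet()) {
            if (!labelProperties.contains(key)) {
                problems.add("unknown property: " + key);
            }
        }
        // Every declared property must be present unless it is nullable.
        for (String key : labelProperties) {
            if (!nullableKeys.contains(key) && !vertexProperties.containsKey(key)) {
                problems.add("missing required property: " + key);
            }
        }
        return problems;
    }
}
```

For the person label above, a vertex with only name and age passes, because addr and weight are nullable, while a vertex without age does not.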

2.2 Edge

An Edge usually represents a real-world relationship or action, such as contains, belongs to, or reads.

// Add the edges for the authors creating HugeGraph
javeme.addEdge("created", hugegraph, "weight", 1)
zhoney.addEdge("created", hugegraph, "weight", 1)
linary.addEdge("created", hugegraph, "weight", 1)

// Add the relationship edges between authors
javeme.addEdge("knows", zhoney, "weight", 1)
javeme.addEdge("knows", linary, "weight", 1)

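// Add the edge for HugeGraph implementing TinkerPop
// (tinkerpop is assumed to be a "software" vertex created earlier; its creation is not shown here)
hugegraph.addEdge("implements", tinkerpop, "weight", 1)
// Add the edge for HugeGraph supporting Gremlin
// (gremlin is assumed to be a "language" vertex created earlier; its creation is not shown here)
hugegraph.addEdge("supports", gremlin, "weight", 1)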

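Taking the edge javeme>created>>hugegraph as an example (the first edge in the example above):

Source vertex: the vertex javeme, of VertexLabel person;
Target vertex: the vertex hugegraph, of VertexLabel software;
Label: EdgeLabel created;
Properties: {"weight": 1}.

An edge consists of five parts:

Label: the type of the edge, i.e. some EdgeLabel
Source Vertex: the vertex the edge starts from; it must be a vertex of the VertexLabel given by the EdgeLabel's sourceLabel(String)
Target Vertex: the vertex the edge arrives at; it must be a vertex of the VertexLabel given by the EdgeLabel's targetLabel(String)
Properties: the edge's properties; which properties exist, and which may be empty, is constrained by the EdgeLabel; the property values are constrained by the corresponding PropertyKeys
Id: the unique identifier of the edge, built by concatenating the Source Vertex, Target Vertex, Label, and sortKeys (if any)
- When the EdgeLabel is singleTime(), the ID format is sourceVertexId>label>>targetVertexId
- When the EdgeLabel is multiTimes(), the ID format is sourceVertexId>label>sortKeys>targetVertexId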
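
The two ID formats can be written out as simple string concatenation. The sketch below only reproduces the textual formats given above, in plain Java; the class EdgeIdSketch and its method names are hypothetical, and it makes no claim about how HugeGraph encodes edge IDs internally.

```java
// Reproduces the edge ID text formats described above; not HugeGraph code.
class EdgeIdSketch {
    // singleTime(): the sortKeys slot is empty, which leaves ">>" before the target.
    static String singleTimeId(String sourceVertexId, String label, String targetVertexId) {
        return sourceVertexId + ">" + label + ">>" + targetVertexId;
    }

    // multiTimes(): the sortKeys value sits between the label and the target.
    static String multiTimesId(String sourceVertexId, String label,
                               String sortKeys, String targetVertexId) {
        return sourceVertexId + ">" + label + ">" + sortKeys + ">" + targetVertexId;
    }
}
```

Calling singleTimeId("javeme", "created", "hugegraph") yields javeme>created>>hugegraph, the edge used in the example. The sortKeys slot is what lets a multiTimes() label hold several edges between the same pair of vertices, since each edge then gets a distinct ID.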