Preface
This article takes a look at Spring AI's PgVectorStore.
Example
pom.xml
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
</dependency>
pgvector
docker run -it --rm --name postgres -p 5432:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres pgvector/pgvector:pg16
Configuration
spring:
  datasource:
    name: pgvector
    driverClassName: org.postgresql.Driver
    url: jdbc:postgresql://localhost:5432/postgres?currentSchema=public&connectTimeout=60&socketTimeout=60
    username: postgres
    password: postgres
  ai:
    vectorstore:
      type: pgvector
      pgvector:
        initialize-schema: true
        index-type: HNSW
        distance-type: COSINE_DISTANCE
        dimensions: 1024
        max-document-batch-size: 10000
        schema-name: public
        table-name: vector_store
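With initialize-schema set to true, a script like the following is executed by default (it shows the default dimension of 1536; the configuration above sets dimensions to 1024):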
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE IF NOT EXISTS vector_store (
    id uuid DEFAULT uuid_generate_v4() PRIMARY KEY,
    content text,
    metadata json,
    embedding vector(1536) -- 1536 is the default embedding dimension
);
CREATE INDEX ON vector_store USING HNSW (embedding vector_cosine_ops);
Source of the script:
org/springframework/ai/vectorstore/pgvector/PgVectorStore.java
public void afterPropertiesSet() {
    logger.info("Initializing PGVectorStore schema for table: {} in schema: {}", this.getVectorTableName(),
            this.getSchemaName());
    logger.info("vectorTableValidationsEnabled {}", this.schemaValidation);
    if (this.schemaValidation) {
        this.schemaValidator.validateTableSchema(this.getSchemaName(), this.getVectorTableName());
    }
    if (!this.initializeSchema) {
        logger.debug("Skipping the schema initialization for the table: {}", this.getFullyQualifiedTableName());
        return;
    }
    // Enable the PGVector, JSONB and UUID support.
    this.jdbcTemplate.execute("CREATE EXTENSION IF NOT EXISTS vector");
    this.jdbcTemplate.execute("CREATE EXTENSION IF NOT EXISTS hstore");
    if (this.idType == PgIdType.UUID) {
        this.jdbcTemplate.execute("CREATE EXTENSION IF NOT EXISTS \"uuid-ossp\"");
    }
    this.jdbcTemplate.execute(String.format("CREATE SCHEMA IF NOT EXISTS %s", this.getSchemaName()));
    // Remove existing VectorStoreTable
    if (this.removeExistingVectorStoreTable) {
        this.jdbcTemplate.execute(String.format("DROP TABLE IF EXISTS %s", this.getFullyQualifiedTableName()));
    }
    this.jdbcTemplate.execute(String.format("""
            CREATE TABLE IF NOT EXISTS %s (
                id %s PRIMARY KEY,
                content text,
                metadata json,
                embedding vector(%d)
            )
            """, this.getFullyQualifiedTableName(), this.getColumnTypeName(), this.embeddingDimensions()));
    if (this.createIndexMethod != PgIndexType.NONE) {
        this.jdbcTemplate.execute(String.format("""
                CREATE INDEX IF NOT EXISTS %s ON %s USING %s (embedding %s)
                """, this.getVectorIndexName(), this.getFullyQualifiedTableName(), this.createIndexMethod,
                this.getDistanceType().index));
    }
}
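To see what SQL these String.format calls would produce for the configuration used in this article, here is a minimal standalone sketch. PgVectorDdlSketch and its method names are hypothetical and not part of Spring AI. The index name vector_store_embedding_idx is made up for illustration, and the id column type is taken from the default script shown earlier.

```java
// Hypothetical helper, NOT part of Spring AI: it only mirrors the String.format
// templates from afterPropertiesSet() so the resulting SQL can be inspected.
final class PgVectorDdlSketch {

    // Renders the CREATE TABLE statement for a table name, id column type and dimension.
    static String createTableSql(String qualifiedTable, String idColumnType, int dimensions) {
        return String.format("""
                CREATE TABLE IF NOT EXISTS %s (
                    id %s PRIMARY KEY,
                    content text,
                    metadata json,
                    embedding vector(%d)
                )
                """, qualifiedTable, idColumnType, dimensions);
    }

    // Renders the CREATE INDEX statement for an index method and operator class.
    static String createIndexSql(String indexName, String qualifiedTable, String indexMethod, String opsClass) {
        return String.format("CREATE INDEX IF NOT EXISTS %s ON %s USING %s (embedding %s)",
                indexName, qualifiedTable, indexMethod, opsClass);
    }

    public static void main(String[] args) {
        // Values follow the YAML above: schema public, table vector_store, dimensions 1024.
        System.out.println(createTableSql("public.vector_store", "uuid DEFAULT uuid_generate_v4()", 1024));
        // The index name below is an assumption made for this example only.
        System.out.println(createIndexSql("vector_store_embedding_idx", "public.vector_store",
                "HNSW", "vector_cosine_ops"));
    }
}
```

With dimensions set to 1024, the rendered table definition contains embedding vector(1024) instead of the 1536 shown in the default script.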
Code
@Test
public void testAddAndSearch() {
    List<Document> documents = List.of(
            new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
            new Document("The World is Big and Salvation Lurks Around the Corner"),
            new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
    // Add the documents to the PgVector store
    pgVectorStore.add(documents);
    // Retrieve documents similar to a query
    List<Document> results = this.pgVectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());
    log.info("results:{}", JSON.toJSONString(results));
}
The output is as follows:
results:[{"contentFormatter":{"excludedEmbedMetadataKeys":[],"excludedInferenceMetadataKeys":[],"metadataSeparator":"\n","metadataTemplate":"{key}: {value}","textTemplate":"{metadata_string}\n\n{content}"},"formattedContent":"distance: 0.43509135\nmeta1: meta1\n\nSpring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!","id":"9dbce9af-0451-4bdb-8f03-1f8b8c4d696f","metadata":{"distance":0.43509135,"meta1":"meta1"},"score":0.5649086534976959,"text":"Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!"},{"contentFormatter":{"$ref":"$[0].contentFormatter"},"formattedContent":"distance: 0.57093126\n\nThe World is Big and Salvation Lurks Around the Corner","id":"92a45683-11fc-48b7-8676-dcca3b518dd4","metadata":{"distance":0.57093126},"score":0.42906874418258667,"text":"The World is Big and Salvation Lurks Around the Corner"},{"contentFormatter":{"$ref":"$[0].contentFormatter"},"formattedContent":"distance: 0.5936024\nmeta2: meta2\n\nYou walk forward facing the past and you turn back toward the future.","id":"298f6565-bcc7-4cbc-8552-4c0e2d021dbf","metadata":{"distance":0.5936024,"meta2":"meta2"},"score":0.40639758110046387,"text":"You walk forward facing the past and you turn back toward the future."}]
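In this output, each score equals 1 minus the distance stored in the metadata (for example, 1 - 0.43509135 ≈ 0.56490865), which matches the configured COSINE_DISTANCE. The sketch below shows how a cosine distance and such a score could be computed by hand. CosineScoreSketch is a hypothetical class for illustration. It is not how PgVectorStore computes the values, since the distance is calculated inside PostgreSQL.

```java
// Hypothetical illustration, NOT Spring AI code: computes cosine distance in plain Java
// and derives a score the same way the logged output relates score to distance.
final class CosineScoreSketch {

    // Cosine distance = 1 - cosine similarity. Returns NaN if either vector is all zeros.
    static double cosineDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Relation observed in the output above: score = 1 - distance.
    static double scoreFromDistance(double distance) {
        return 1.0 - distance;
    }

    public static void main(String[] args) {
        float[] query = {1f, 0f};
        float[] same = {2f, 0f};        // same direction as the query
        float[] orthogonal = {0f, 3f};  // perpendicular to the query
        System.out.println(cosineDistance(query, same));
        System.out.println(cosineDistance(query, orthogonal));
        System.out.println(scoreFromDistance(0.43509135));
    }
}
```

A smaller distance therefore means a higher score, which is why the "Spring AI rocks!!" document ranks first for the query "Spring".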
Source code
PgVectorStoreAutoConfiguration
org/springframework/ai/vectorstore/pgvector/autoconfigure/PgVectorStoreAutoConfiguration.java
@AutoConfiguration(after = JdbcTemplateAutoConfiguration.class)
@ConditionalOnClass({ PgVectorStore.class, DataSource.class, JdbcTemplate.class })
@EnableConfigurationProperties(PgVectorStoreProperties.class)
@ConditionalOnProperty(name = SpringAIVectorStoreTypes.TYPE, havingValue = SpringAIVectorStoreTypes.PGVECTOR,
        matchIfMissing = true)
public class PgVectorStoreAutoConfiguration {

    @Bean
    @ConditionalOnMissingBean(BatchingStrategy.class)
    BatchingStrategy pgVectorStoreBatchingStrategy() {
        return new TokenCountBatchingStrategy();
    }

    @Bean
    @ConditionalOnMissingBean
    public PgVectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingModel embeddingModel,
            PgVectorStoreProperties properties, ObjectProvider<ObservationRegistry> observationRegistry,
            ObjectProvider<VectorStoreObservationConvention> customObservationConvention,
            BatchingStrategy batchingStrategy) {
        var initializeSchema = properties.isInitializeSchema();
        return PgVectorStore.builder(jdbcTemplate, embeddingModel)
            .schemaName(properties.getSchemaName())
            .idType(properties.getIdType())
            .vectorTableName(properties.getTableName())
            .vectorTableValidationsEnabled(properties.isSchemaValidation())
            .dimensions(properties.getDimensions())
            .distanceType(properties.getDistanceType())
            .removeExistingVectorStoreTable(properties.isRemoveExistingVectorStoreTable())
            .indexType(properties.getIndexType())
            .initializeSchema(initializeSchema)
            .observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
            .customObservationConvention(customObservationConvention.getIfAvailable(() -> null))
            .batchingStrategy(batchingStrategy)
            .maxDocumentBatchSize(properties.getMaxDocumentBatchSize())
            .build();
    }
}
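PgVectorStoreAutoConfiguration auto-configures a PgVectorStore when spring.ai.vectorstore.type is pgvector (or is not set at all, because matchIfMissing = true). It depends on PgVectorStoreProperties and runs after JdbcTemplateAutoConfiguration.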
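The effect of havingValue together with matchIfMissing = true can be modeled with a few lines of plain Java. This is a simplified sketch of the decision only, not Spring Boot's actual condition implementation. OnPropertySketch is a hypothetical class, and the case-insensitive comparison is an assumption of this sketch.

```java
// Hypothetical, simplified model of a "property has value" condition.
// NOT Spring Boot's implementation; it only illustrates the decision table.
final class OnPropertySketch {

    // actualValue is null when the property is not set at all.
    static boolean matches(String actualValue, String havingValue, boolean matchIfMissing) {
        if (actualValue == null) {
            // Property absent: the outcome is decided by matchIfMissing alone.
            return matchIfMissing;
        }
        // Property present: it has to equal the required value (case-insensitive here by assumption).
        return havingValue.equalsIgnoreCase(actualValue);
    }

    public static void main(String[] args) {
        System.out.println(matches("pgvector", "pgvector", true)); // type set to pgvector
        System.out.println(matches(null, "pgvector", true));       // type not set
        System.out.println(matches("milvus", "pgvector", true));   // type set to another store
    }
}
```

Under this model, the pgvector auto-configuration applies in the first two cases and backs off only when the type property names a different store.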
PgVectorStoreProperties
org/springframework/ai/vectorstore/pgvector/autoconfigure/PgVectorStoreProperties.java
@ConfigurationProperties(PgVectorStoreProperties.CONFIG_PREFIX)
public class PgVectorStoreProperties extends CommonVectorStoreProperties {

    public static final String CONFIG_PREFIX = "spring.ai.vectorstore.pgvector";

    private int dimensions = PgVectorStore.INVALID_EMBEDDING_DIMENSION;
    private PgIndexType indexType = PgIndexType.HNSW;
    private PgDistanceType distanceType = PgDistanceType.COSINE_DISTANCE;
    private boolean removeExistingVectorStoreTable = false;
    // Dynamically generate table name in PgVectorStore to allow backward compatibility
    private String tableName = PgVectorStore.DEFAULT_TABLE_NAME;
    private String schemaName = PgVectorStore.DEFAULT_SCHEMA_NAME;
    private PgVectorStore.PgIdType idType = PgVectorStore.PgIdType.UUID;
    private boolean schemaValidation = PgVectorStore.DEFAULT_SCHEMA_VALIDATION;
    private int maxDocumentBatchSize = PgVectorStore.MAX_DOCUMENT_BATCH_SIZE;

    //......
}
PgVectorStoreProperties inherits the initializeSchema setting from CommonVectorStoreProperties and exposes the spring.ai.vectorstore.pgvector configuration. Its main properties are dimensions, indexType, distanceType, removeExistingVectorStoreTable, tableName, schemaName, idType, schemaValidation and maxDocumentBatchSize.
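The source shown here does not include the code that consumes maxDocumentBatchSize. One plausible reading, given the name, is that documents are written in chunks of at most that many items. The sketch below illustrates that idea under this assumption. BatchSketch and partition are hypothetical names, not Spring AI APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of size-based batching, NOT taken from PgVectorStore.
final class BatchSketch {

    // Splits items into consecutive sublists of at most maxBatchSize elements.
    static <T> List<List<T>> partition(List<T> items, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, items.size());
            batches.add(items.subList(start, end));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Five items with a batch size of two give batches of sizes 2, 2 and 1.
        List<List<Integer>> batches = partition(List.of(1, 2, 3, 4, 5), 2);
        System.out.println(batches);
    }
}
```

With max-document-batch-size: 10000 as configured above, a call that adds three documents would fit in a single batch under this reading.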
JdbcTemplateAutoConfiguration
org/springframework/boot/autoconfigure/jdbc/JdbcTemplateAutoConfiguration.java
@AutoConfiguration(after = DataSourceAutoConfiguration.class)
@ConditionalOnClass({ DataSource.class, JdbcTemplate.class })
@ConditionalOnSingleCandidate(DataSource.class)
@EnableConfigurationProperties(JdbcProperties.class)
@Import({ DatabaseInitializationDependencyConfigurer.class, JdbcTemplateConfiguration.class,
        NamedParameterJdbcTemplateConfiguration.class })
public class JdbcTemplateAutoConfiguration {

}
JdbcTemplateAutoConfiguration imports DatabaseInitializationDependencyConfigurer, JdbcTemplateConfiguration and NamedParameterJdbcTemplateConfiguration.
Summary
Spring AI provides spring-ai-starter-vector-store-pgvector to auto-configure PgVectorStore. In addition to the spring.ai.vectorstore.pgvector settings, spring.datasource must also be configured.
doc
- vectordbs/pgvector