This article looks at how to batch insert or update entities using Hibernate/JPA. Batching lets us send a group of SQL statements to the database in a single network call. This way, we can optimize the network and memory usage of our application.
1. Creating the Entities
First, we create a School entity:
@Entity
@Data
public class School {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private long id;
    private String name;
    @OneToMany(mappedBy = "school")
    private List<Student> students;
}
Each School has zero or more Students:
@Entity
@Data
public class Student {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private long id;
    private String name;
    @ManyToOne
    private School school;
}
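2. Tracing SQL Queries
When running the examples, we need to verify that the insert/update statements are indeed sent in batches. Unfortunately, we can't tell from Hibernate's log statements whether the SQL statements are batched. So we'll use a data source proxy to trace the Hibernate/JPA SQL statements: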
private static class ProxyDataSourceInterceptor implements MethodInterceptor {
    private final DataSource dataSource;
    public ProxyDataSourceInterceptor(final DataSource dataSource) {
        this.dataSource = ProxyDataSourceBuilder.create(dataSource)
            .name("Batch-Insert-Logger")
            .asJson().countQuery().logQueryToSysOut().build();
    }
    // Other methods...
}
3. Default Behavior
Hibernate doesn't enable batching by default. This means that it sends a separate SQL statement for each insert/update operation:
@Transactional
@Test
public void whenNotConfigured_ThenSendsInsertsSeparately() {
    for (int i = 0; i < 10; i++) {
        School school = createSchool(i);
        entityManager.persist(school);
    }
    entityManager.flush();
}
Here, we persist 10 School entities. If we look at the query logs, we can see that Hibernate sends each insert statement separately:
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School1","1"]]
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School2","2"]]
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School3","3"]]
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School4","4"]]
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School5","5"]]
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School6","6"]]
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School7","7"]]
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School8","8"]]
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School9","9"]]
"querySize":1, "batchSize":0, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School10","10"]]
Therefore, we should configure Hibernate to enable batching. To do this, we set the hibernate.jdbc.batch_size property to a number greater than 0. If we create the EntityManager manually, we should add hibernate.jdbc.batch_size to the Hibernate properties:
public Properties hibernateProperties() {
    Properties properties = new Properties();
    properties.put("hibernate.jdbc.batch_size", "5");
    // Other properties...
    return properties;
}
If we're using Spring Boot, we can define it as an application property:
spring.jpa.properties.hibernate.jdbc.batch_size=5
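4. Batch Insert for a Single Table
4.1. Batch Insert Without Explicit Flush
First, let's see how we can use batch inserts when dealing with only one entity type. We'll use the previous code sample, but this time with batching enabled: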
@Transactional
@Test
public void whenInsertingSingleTypeOfEntity_thenCreatesSingleBatch() {
    for (int i = 0; i < 10; i++) {
        School school = createSchool(i);
        entityManager.persist(school);
    }
}
Here, we persist 10 School entities. When we look at the logs, we can verify that Hibernate sends the insert statements in batches:
"batch":true, "querySize":1, "batchSize":5, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School1","1"],["School2","2"],["School3","3"],["School4","4"],["School5","5"]]
"batch":true, "querySize":1, "batchSize":5, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School6","6"],["School7","7"],["School8","8"],["School9","9"],["School10","10"]]
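An important point to mention here is memory consumption. When we persist an entity, Hibernate stores it in the persistence context. For example, if we persist 100,000 entities in one transaction, we end up with 100,000 entity instances in memory, possibly causing an OutOfMemoryError.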
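To make the batch size arithmetic in this section concrete, here is a minimal sketch in plain Java. It is not Hibernate code: BatchPartitioner and partition are hypothetical names, and the model only assumes that pending rows are cut into groups of at most batch_size, with a value of 0 meaning one statement per row.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of JDBC batching; hypothetical helper, not Hibernate code.
class BatchPartitioner {

    // Splits the pending rows into groups of at most batchSize.
    // A batchSize of 0 or less models the default (no batching):
    // every row goes out as its own statement.
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        int size = batchSize > 0 ? batchSize : 1;
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += size) {
            int end = Math.min(start + size, rows.size());
            batches.add(new ArrayList<>(rows.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> schools = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            schools.add("School" + i);
        }
        System.out.println(partition(schools, 0).size()); // 10 separate statements
        System.out.println(partition(schools, 5).size()); // 2 batches of 5
    }
}
```

With 10 rows, the sketch yields 10 round trips when batching is off and 2 when the batch size is 5, which matches the two log entries above.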
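4.2. Batch Insert With Explicit Flush
Now we'll look at how to optimize memory usage during batching operations. Let's take a closer look at the role of the persistence context.
First, the persistence context stores newly created and modified entities in memory. Hibernate sends these changes to the database when the transaction is synchronized. This generally happens at the end of a transaction. However, calling EntityManager.flush() also triggers a transaction synchronization.
Second, the persistence context serves as an entity cache, which is why it's also called the first-level cache. To clear the entities in the persistence context, we can call EntityManager.clear().
Therefore, to reduce the memory load during batching, we can call EntityManager.flush() and EntityManager.clear() in our application code whenever the batch size is reached: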
@Transactional
@Test
public void whenFlushingAfterBatch_ThenClearsMemory() {
    for (int i = 0; i < 10; i++) {
        if (i > 0 && i % BATCH_SIZE == 0) {
            entityManager.flush();
            entityManager.clear();
        }
        School school = createSchool(i);
        entityManager.persist(school);
    }
}
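Here, we flush the entities in the persistence context, which makes Hibernate send the queries to the database. Furthermore, by clearing the persistence context, we remove the School entities from memory. The batching behavior remains the same.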
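The memory effect of the flush-and-clear loop can be illustrated with a toy model. The sketch below is plain Java, not Hibernate: a list stands in for the persistence context, and FlushClearSimulation and peakManagedEntities are hypothetical names. It only counts how many objects are held at once.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a persistence context; it only counts managed objects.
class FlushClearSimulation {

    // Returns the largest number of entities held at once while persisting
    // `total` entities. A flushEvery of 0 means "never flush/clear explicitly".
    static int peakManagedEntities(int total, int flushEvery) {
        List<String> context = new ArrayList<>();
        int peak = 0;
        for (int i = 0; i < total; i++) {
            if (flushEvery > 0 && i > 0 && i % flushEvery == 0) {
                // flush() would send the pending inserts; clear() detaches everything
                context.clear();
            }
            context.add("School" + (i + 1));
            peak = Math.max(peak, context.size());
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println(peakManagedEntities(100_000, 0)); // 100000
        System.out.println(peakManagedEntities(100_000, 5)); // 5
    }
}
```

In this model the peak never exceeds the flush interval, which is the reason for clearing the context at every batch boundary.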
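5. Batch Insert for Multiple Tables
Now let's see how to configure batch inserts when dealing with multiple entity types in one transaction.
When we persist several types of entities, Hibernate creates a separate batch for each entity type. This is because a single batch can contain only one type of entity.
Additionally, as Hibernate collects insert statements, it starts a new batch whenever it encounters an entity type different from the one in the current batch. This is the case even if there is already a batch for that entity type: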
@Transactional
@Test
public void whenThereAreMultipleEntities_ThenCreatesNewBatch() {
    for (int i = 0; i < 10; i++) {
        if (i > 0 && i % BATCH_SIZE == 0) {
            entityManager.flush();
            entityManager.clear();
        }
        School school = createSchool(i);
        entityManager.persist(school);
        Student firstStudent = createStudent(school);
        Student secondStudent = createStudent(school);
        entityManager.persist(firstStudent);
        entityManager.persist(secondStudent);
    }
}
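Here, we insert a School, assign it to 2 Students, and repeat this process 10 times.
In the logs, we see that Hibernate sends the School insert statements in several batches of size 1, whereas we expected only 2 batches of size 5. Likewise, the Student insert statements are sent in several batches of size 2 instead of 4 batches of size 5: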
"batch":true, "querySize":1, "batchSize":1, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School1","1"]]
"batch":true, "querySize":1, "batchSize":2, "query":["insert into student (name, school_id, id)
values (?, ?, ?)"], "params":[["Student-School1","1","2"],["Student-School1","1","3"]]
"batch":true, "querySize":1, "batchSize":1, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School2","4"]]
"batch":true, "querySize":1, "batchSize":2, "query":["insert into student (name, school_id, id)
values (?, ?, ?)"], "params":[["Student-School2","4","5"],["Student-School2","4","6"]]
"batch":true, "querySize":1, "batchSize":1, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School3","7"]]
"batch":true, "querySize":1, "batchSize":2, "query":["insert into student (name, school_id, id)
values (?, ?, ?)"], "params":[["Student-School3","7","8"],["Student-School3","7","9"]]
Other log lines...
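To batch all insert statements of the same entity type, we should configure the hibernate.order_inserts property.
If we create the EntityManagerFactory manually, we can set the Hibernate property programmatically: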
public Properties hibernateProperties() {
    Properties properties = new Properties();
    properties.put("hibernate.order_inserts", "true");
    // Other properties...
    return properties;
}
If we're using Spring Boot, we can configure the property in application.properties:
spring.jpa.properties.hibernate.order_inserts=true
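After adding this property, each flush of 5 Schools and their 10 Students produces 1 batch for the School inserts and 2 batches for the Student inserts: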
"batch":true, "querySize":1, "batchSize":5, "query":["insert into school (name, id) values (?, ?)"],
"params":[["School6","16"],["School7","19"],["School8","22"],["School9","25"],["School10","28"]]
"batch":true, "querySize":1, "batchSize":5, "query":["insert into student (name, school_id, id)
values (?, ?, ?)"], "params":[["Student-School6","16","17"],["Student-School6","16","18"],
["Student-School7","19","20"],["Student-School7","19","21"],["Student-School8","22","23"]]
"batch":true, "querySize":1, "batchSize":5, "query":["insert into student (name, school_id, id)
values (?, ?, ?)"], "params":[["Student-School8","22","24"],["Student-School9","25","26"],
["Student-School9","25","27"],["Student-School10","28","29"],["Student-School10","28","30"]]
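The difference between the two logs in this section can be reproduced with a simplified model in plain Java. This is not how Hibernate is implemented (its real ordering also has to respect foreign key dependencies); InsertOrderingSimulation and its methods are hypothetical names, and the only assumption is that a batch ends when the entity type changes or the batch size is reached.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of insert batching across entity types; not Hibernate code.
class InsertOrderingSimulation {

    // Without ordering: a batch ends when the entity type changes
    // or when the batch reaches batchSize. Returns the size of each batch.
    static List<Integer> batchesInArrivalOrder(List<String> types, int batchSize) {
        List<Integer> sizes = new ArrayList<>();
        String current = null;
        int count = 0;
        for (String type : types) {
            if (count > 0 && (!type.equals(current) || count == batchSize)) {
                sizes.add(count);
                count = 0;
            }
            current = type;
            count++;
        }
        if (count > 0) {
            sizes.add(count);
        }
        return sizes;
    }

    // With ordering: group the inserts by type first (keeping first-seen order),
    // then apply the same batching rule.
    static List<Integer> batchesAfterOrdering(List<String> types, int batchSize) {
        Map<String, Integer> perType = new LinkedHashMap<>();
        for (String type : types) {
            perType.merge(type, 1, Integer::sum);
        }
        List<String> ordered = new ArrayList<>();
        perType.forEach((type, n) -> {
            for (int i = 0; i < n; i++) {
                ordered.add(type);
            }
        });
        return batchesInArrivalOrder(ordered, batchSize);
    }

    public static void main(String[] args) {
        // One flush worth of work: 5 schools, each followed by 2 students
        List<String> types = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            types.add("School");
            types.add("Student");
            types.add("Student");
        }
        System.out.println(batchesInArrivalOrder(types, 5)); // [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
        System.out.println(batchesAfterOrdering(types, 5));  // [5, 5, 5]
    }
}
```

For one flush of 5 Schools and 10 Students, the unordered model gives ten small batches, while the ordered one gives one School batch and two Student batches of size 5.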
6. Batch Update
Now let's move on to batch updates. Similar to batch inserts, we can group several update statements and send them to the database in one go.
To do this, we configure the hibernate.order_updates and hibernate.jdbc.batch_versioned_data properties. If we create the EntityManagerFactory manually, we can set the properties programmatically:
public Properties hibernateProperties() {
    Properties properties = new Properties();
    properties.put("hibernate.order_updates", "true");
    properties.put("hibernate.jdbc.batch_versioned_data", "true");
    // Other properties...
    return properties;
}
If we're using Spring Boot, we add them to application.properties:
spring.jpa.properties.hibernate.order_updates=true
spring.jpa.properties.hibernate.jdbc.batch_versioned_data=true
With these properties configured, Hibernate should group the update statements in batches:
@Transactional
@Test
public void whenUpdatingEntities_thenCreatesBatch() {
    TypedQuery<School> schoolQuery =
        entityManager.createQuery("SELECT s from School s", School.class);
    List<School> allSchools = schoolQuery.getResultList();
    for (School school : allSchools) {
        school.setName("Updated_" + school.getName());
    }
}
Here, we update the School entities, and Hibernate sends the SQL statements in 2 batches of size 5:
"batch":true, "querySize":1, "batchSize":5, "query":["update school set name=? where id=?"],
"params":[["Updated_School1","1"],["Updated_School2","2"],["Updated_School3","3"],
["Updated_School4","4"],["Updated_School5","5"]]
"batch":true, "querySize":1, "batchSize":5, "query":["update school set name=? where id=?"],
"params":[["Updated_School6","6"],["Updated_School7","7"],["Updated_School8","8"],
["Updated_School9","9"],["Updated_School10","10"]]
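7. @Id Generation Strategy
When we want to use batching for inserts/updates, we should be aware of the primary key generation strategy. If our entities use the GenerationType.IDENTITY identifier generator, Hibernate silently disables batch inserts.
Since the entities in our examples use the GenerationType.SEQUENCE identifier generator, Hibernate enables batch operations: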
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE)
private long id;
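One way to picture why the identifier strategy matters is the toy model below. It is plain Java, not Hibernate code, and IdStrategySimulation with its methods is a hypothetical illustration. It assumes that a sequence-style id can be obtained before the INSERT runs, whereas an identity-style id only exists once the row has been inserted, so that INSERT cannot wait in a queue. In a real setup, fetching a sequence value is itself a database call, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of why the identifier strategy matters; not Hibernate code.
class IdStrategySimulation {

    final List<String> pendingInserts = new ArrayList<>(); // queued until flush
    int statementsSentBeforeFlush = 0;
    private long sequence = 0;
    private long identityColumn = 0;

    // SEQUENCE-like: the id comes from a separate counter, so the INSERT can wait.
    long persistWithSequence(String name) {
        long id = ++sequence;
        pendingInserts.add("insert " + name + " id=" + id);
        return id;
    }

    // IDENTITY-like: the id only exists once the row is inserted,
    // so the INSERT has to run immediately and cannot be queued.
    long persistWithIdentity(String name) {
        statementsSentBeforeFlush++;
        return ++identityColumn;
    }

    public static void main(String[] args) {
        IdStrategySimulation sequenceBased = new IdStrategySimulation();
        IdStrategySimulation identityBased = new IdStrategySimulation();
        for (int i = 1; i <= 5; i++) {
            sequenceBased.persistWithSequence("School" + i);
            identityBased.persistWithIdentity("School" + i);
        }
        System.out.println(sequenceBased.pendingInserts.size());         // 5
        System.out.println(identityBased.statementsSentBeforeFlush);     // 5
    }
}
```

In the sequence-based run, all 5 inserts are still queued and can be grouped into a batch; in the identity-based run, each one has already been sent on its own.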