I am using the following code to add documents to a Lucene index. After indexing 23,425 documents, the folder that holds the index is 447.4 MB. By contrast, the same 23,425 records stored in a Parquet file take only 625 KB. The index folder seems excessively large. Could someone help identify why this is happening and how to reduce it? Here is the code I am using:
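
    import java.io.IOException;
    import java.nio.file.Paths;
    import java.util.List;
    import java.util.Map;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.document.StoredField;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.MMapDirectory;

    // directory, namespace, operations, docCount and logger are defined elsewhere
    MMapDirectory indexDirectory = new MMapDirectory(Paths.get(directory));
    // Configure the IndexWriter with an analyzer
    StandardAnalyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(analyzer);
    IndexWriter indexWriter = new IndexWriter(indexDirectory, config);

    for (Map.Entry<String, OperationAggregation> entry : operations.entrySet()) {
        // One document per operation; these four fields are indexed verbatim and stored
        Document doc1 = new Document();
        doc1.add(new StringField("namespace", namespace, Store.YES));
        doc1.add(new StringField("type", "operations", Store.YES));
        doc1.add(new StringField("data", entry.getKey(), Store.YES));
        doc1.add(new StringField("serviceName", entry.getValue().getServiceName(), Store.YES));

        // Every attribute is added as a stored-only field (not indexed)
        List<AggregationAttribute> attributes = entry.getValue().getOperationAttributes();
        for (int i = 0; i < attributes.size(); i++) {
            doc1.add(new StoredField(attributes.get(i).getName(),
                    String.valueOf(attributes.get(i).getValue())));
        }

        try {
            docCount.getAndIncrement();
            indexWriter.addDocument(doc1);
        } catch (IOException e) {
            logger.error("Error while adding document to index", e);
        }
    }

    indexWriter.commit();
    indexWriter.close();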
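
To narrow down where the 447.4 MB goes, it may help to first break the index folder down by file extension. The sketch below is not part of my indexing code and uses only the Java standard library; the class name `IndexSizeReport` and the method `sizeByExtension` are hypothetical names chosen for illustration. It assumes the index files sit directly in the folder passed as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

class IndexSizeReport {

    /** Sums the size in bytes of the regular files directly inside dir, grouped by extension. */
    static Map<String, Long> sizeByExtension(Path dir) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (Stream<Path> files = Files.list(dir)) {
            for (Path file : (Iterable<Path>) files::iterator) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip sub-directories
                }
                String name = file.getFileName().toString();
                int dot = name.lastIndexOf('.');
                // Files without a dot (for example segments_N) are grouped under "(none)"
                String ext = dot >= 0 ? name.substring(dot + 1) : "(none)";
                totals.merge(ext, Files.size(file), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // Pass the index folder as the first argument; defaults to the current directory
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        sizeByExtension(dir).forEach(
                (ext, bytes) -> System.out.printf("%-8s %,d bytes%n", ext, bytes));
    }
}
```

In a Lucene index, `.fdt` files hold stored field data and `.tim` files hold the term dictionary, so a large `.fdt` total would point at the `Store.YES` and `StoredField` values. A large number of segment files would instead suggest that files from earlier runs are still in the folder.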