I have written a program that reads a file, processes its records to extract some data using a third-party library, and then dispatches the data to a remote server.
To speed this up, I create Callable tasks that take the extracted data and perform the dispatch step, and then use an executor service to run them. The third-party library does not seem to work well in a multi-threaded environment, so I have kept the data extraction step out of the Callable tasks.
Pseudo-code is as follows:

Iterator<Record> records = .....
List<Record> batch = ....
Data extractedData = ....
List<MyTask> tasks = ....

while (records.hasNext()) {
    Record record = records.next();
    batch.add(record);
    extractedData.add(extractDataUsing3rdPartyLibrary(record));
    if (batch.size() == BATCH_SIZE) {
        // NOTE: MyTask must take a copy of extractedData, since it is cleared below
        MyTask task = new MyTask(extractedData, ....);
        tasks.add(task);
        extractedData.clear();
        batch.clear(); // reset the batch, otherwise batch.size() == BATCH_SIZE is only true once
    }
}
executeTasks(executorService, tasks);
....
....
class MyTask implements Callable<Integer> {
    public Integer call() throws Exception {
        // dispatch extractedData
        // clear extractedData
    }
}
The problem is that the extracted data is memory-heavy, and as a result I frequently run into out-of-memory errors.
I am thinking of an approach in which I would intermittently check the size of the accumulated tasks, and if it exceeds a certain threshold, process the tasks created so far, clean up the data, and repeat.
Is this a good approach? And if so, what would be a good way of finding the in-memory size of objects in Java, given that there is no C++-style sizeof? I came across the Instrumentation API, but it requires setting up an agent.
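One shape this idea could take, sketched minimally: instead of measuring object sizes at all, bound how many batches can be in flight and let the executor apply back-pressure. The class, batch size, queue capacity, and the string stand-in for my real Data are all placeholders, not my actual code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedDispatch {
    static final int BATCH_SIZE = 10;
    static final AtomicInteger dispatched = new AtomicInteger();

    // Placeholder for the real dispatch-to-remote-server step.
    static Integer dispatch(List<String> batch) {
        dispatched.addAndGet(batch.size());
        return batch.size();
    }

    public static void main(String[] args) throws Exception {
        // Small bounded queue: when it is full, CallerRunsPolicy makes the
        // submitting (reader) thread dispatch the batch itself, which pauses
        // reading and caps how much extracted data can pile up in memory.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(4),
                new ThreadPoolExecutor.CallerRunsPolicy());

        List<String> extracted = new ArrayList<>();
        for (int i = 0; i < 100; i++) {                 // stands in for the record iterator
            extracted.add("record-" + i);               // stands in for extractDataUsing3rdPartyLibrary
            if (extracted.size() == BATCH_SIZE) {
                // Copy before clearing, so clear() cannot wipe the task's data.
                List<String> copy = new ArrayList<>(extracted);
                pool.submit(() -> dispatch(copy));
                extracted.clear();
            }
        }
        if (!extracted.isEmpty()) {
            pool.submit(() -> dispatch(new ArrayList<>(extracted)));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println(dispatched.get()); // 100
    }
}
```

The point of this variant is that the threshold is expressed in batches rather than bytes, so no per-object size measurement is needed.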
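As a rough stand-in for sizeof that avoids an Instrumentation agent, I could watch overall heap usage via Runtime and flush whenever it crosses a threshold. This is a heuristic sketch (the 0.7 threshold is an arbitrary placeholder), not an exact per-object size:

```java
public class HeapCheck {
    // Fraction of the maximum heap currently in use; a coarse proxy,
    // not the size of any particular object graph.
    static double heapUsedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    public static void main(String[] args) {
        double f = heapUsedFraction();
        System.out.println(f >= 0.0 && f <= 1.0); // sanity check: fraction stays in [0, 1]
        if (f > 0.7) {
            // e.g. drain the pending tasks and clear buffers here
        }
    }
}
```

If an actual per-object estimate is needed, I believe the JOL library (org.openjdk.jol) can report object layouts and retained sizes without me writing my own agent, at the cost of an extra dependency.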