Name: Java Wordcount
Contact Person: support-compss@bsc.es
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Description
Wordcount application. There are two versions of Wordcount, depending on how the input data is given.
Version 1
Single input file: all the text is given in one file, and the chunks are computed from a BLOCK_SIZE parameter.
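As a rough illustration of the block-based approach, the following plain Java sketch splits a byte buffer into chunks of BLOCK_SIZE bytes, counts the words in each chunk, and merges the partial results. It is not taken from the application sources: the class and method names (BlockWordcountSketch, numBlocks, countBlock, countAll) are hypothetical, it runs sequentially without any COMPSs task annotations, and it assumes ASCII text with whitespace-separated words.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of wordcount.jar.
final class BlockWordcountSketch {

    // Number of blocks needed to cover fileSize bytes (ceiling division).
    static long numBlocks(long fileSize, int blockSize) {
        return (fileSize + blockSize - 1) / blockSize;
    }

    // Counts whitespace-separated words in data[offset, offset + length).
    static Map<String, Integer> countBlock(byte[] data, int offset, int length) {
        String text = new String(data, offset, length, StandardCharsets.UTF_8);
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Adds every count in partial to the running total.
    static void mergeInto(Map<String, Integer> total, Map<String, Integer> partial) {
        partial.forEach((word, n) -> total.merge(word, n, Integer::sum));
    }

    // Walks the buffer block by block and merges the per-block counts.
    static Map<String, Integer> countAll(byte[] data, int blockSize) {
        Map<String, Integer> total = new HashMap<>();
        for (int offset = 0; offset < data.length; offset += blockSize) {
            int length = Math.min(blockSize, data.length - offset);
            mergeInto(total, countBlock(data, offset, length));
        }
        return total;
    }
}
```

Note that this naive split cuts any word that straddles a block boundary: with the input "hello world" and a block size of 3, the sketch counts the fragments "hel", "lo", "wor" and "ld" rather than the two original words. How the actual application treats such boundaries is not described here.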
Version 2
Multiple input files: the text fragments are already stored in separate files under the same directory.
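For the multiple-files layout, the idea can be sketched as one word count per file followed by a merge of the partial results. The Java code below is an assumption-laden illustration, not the application's implementation: MultiFileWordcountSketch, countFile and countFolder are hypothetical names, the code is sequential with no COMPSs task annotations, and it assumes UTF-8 text files placed directly under the folder (subdirectories are ignored).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class for illustration; not part of wordcount.jar.
final class MultiFileWordcountSketch {

    // Counts whitespace-separated words in a single text file.
    static Map<String, Integer> countFile(Path file) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Counts words in every regular file directly under folder and merges the results.
    static Map<String, Integer> countFolder(Path folder) throws IOException {
        List<Path> files;
        try (Stream<Path> entries = Files.list(folder)) {
            files = entries.filter(Files::isRegularFile)
                           .sorted()
                           .collect(Collectors.toList());
        }
        Map<String, Integer> total = new HashMap<>();
        for (Path file : files) {
            // Each per-file result is independent, so only this merge step shares state.
            countFile(file).forEach((word, n) -> total.merge(word, n, Integer::sum));
        }
        return total;
    }
}
```

Because each file is counted independently, the per-file step is the natural unit of parallel work in this layout, while the merge is the only step that combines state across files.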
Execution instructions
Usage:
runcompss --classpath=application_sources/jar/wordcount.jar wordcount.multipleFiles.Wordcount DATA_FOLDER
runcompss --classpath=application_sources/jar/wordcount.jar wordcount.uniqueFile.Wordcount DATA_FILE BLOCK_SIZE
where:
- DATA_FOLDER: Absolute path to the base folder of the dataset files
- DATA_FILE: Absolute path to the dataset file
- BLOCK_SIZE: Number of bytes of each block
Execution Examples
runcompss --classpath=application_sources/jar/wordcount.jar wordcount.multipleFiles.Wordcount dataset/data-set/
runcompss --classpath=application_sources/jar/wordcount.jar wordcount.uniqueFile.Wordcount dataset/data-set/file_small.txt 650
runcompss --classpath=application_sources/jar/wordcount.jar wordcount.uniqueFile.Wordcount dataset/data-set/file_long.txt 250000
Build
Option 1: Native Java
cd application_sources/; javac src/main/java/wordcount/*.java
cd src/main/java/; jar cf wordcount.jar wordcount/
cd ../../../; mv src/main/java/wordcount.jar jar/
Option 2: Maven
cd application_sources/
mvn clean package
Version History
COMPSs 3.3 (earliest) Created 4th Dec 2023 at 14:22 by Raül Sirvent
Run with COMPSs 3.3 version
Frozen
COMPSs-3.3
3f5738b
Additional credit
The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)
Created: 4th Dec 2023 at 14:22
Last updated: 4th Dec 2023 at 14:25