Java COMPSs wordcount example (laptop run, files used as inputs)
COMPSs 3.3

Workflow Type: COMPSs
Stable

Name: Java Wordcount
Contact Person: support-compss@bsc.es
Access Level: public
License Agreement: Apache2
Platform: COMPSs

Description

Wordcount application. There are two versions of Wordcount, depending on how the input data is given.

Version 1

"Single input file": all the text is in one file, and the file is split into chunks whose size is set by the BLOCK_SIZE parameter.
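
As a rough illustration of the single-file idea, the sketch below reads a file in BLOCK_SIZE-byte chunks, counts the words of each chunk, and merges the partial counts. It is plain sequential Java, not the application's actual source: the class and method names (SingleFileWordcountSketch, countWords, mergeCounts, wordcount) are hypothetical, words are assumed to be separated by whitespace, and the text is assumed to be single-byte (ASCII). A word that straddles a block boundary is counted as two fragments here; how the real application treats that case is not stated on this page.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class SingleFileWordcountSketch {

    // Count whitespace-separated words in one block of text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Fold one partial result into the accumulated result.
    static Map<String, Integer> mergeCounts(Map<String, Integer> into,
                                           Map<String, Integer> partial) {
        partial.forEach((word, n) -> into.merge(word, n, Integer::sum));
        return into;
    }

    // Read the file in blockSize-byte chunks, count each chunk, merge the counts.
    static Map<String, Integer> wordcount(String dataFile, int blockSize) throws IOException {
        Map<String, Integer> result = new HashMap<>();
        try (InputStream in = new FileInputStream(dataFile)) {
            byte[] buffer = new byte[blockSize];
            int read;
            // readNBytes (Java 9+) fills the buffer unless end of file is reached.
            while ((read = in.readNBytes(buffer, 0, blockSize)) > 0) {
                String block = new String(buffer, 0, read, StandardCharsets.UTF_8);
                mergeCounts(result, countWords(block));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Arguments mirror the usage line: DATA_FILE BLOCK_SIZE
        Map<String, Integer> counts = wordcount(args[0], Integer.parseInt(args[1]));
        counts.forEach((word, n) -> System.out.println(word + ": " + n));
    }
}
```

Each call to countWords depends only on its own block, which is what makes a per-block decomposition suitable for running as independent tasks.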

Version 2

"Multiple input files": the text fragments are already split into separate files under the same directory.
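
The multiple-file variant can be pictured the same way, with one partial count per file instead of per block. The sketch below is a sequential, standard-library-only approximation under the assumption that every regular file directly inside the folder is a UTF-8 text fragment. The names MultipleFilesWordcountSketch, countWordsInFile and wordcount are hypothetical and are not taken from the application's source.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MultipleFilesWordcountSketch {

    // Partial result: word counts for a single fragment file.
    static Map<String, Integer> countWordsInFile(Path file) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Count every regular file in the folder and merge the partial results.
    static Map<String, Integer> wordcount(String dataFolder) throws IOException {
        Map<String, Integer> result = new TreeMap<>(); // sorted for readable output
        try (DirectoryStream<Path> files = Files.newDirectoryStream(Paths.get(dataFolder))) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    countWordsInFile(file)
                        .forEach((word, n) -> result.merge(word, n, Integer::sum));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Argument mirrors the usage line: DATA_FOLDER
        wordcount(args[0]).forEach((word, n) -> System.out.println(word + ": " + n));
    }
}
```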

Execution instructions

Usage:

runcompss --classpath=application_sources/jar/wordcount.jar wordcount.multipleFiles.Wordcount DATA_FOLDER
runcompss --classpath=application_sources/jar/wordcount.jar wordcount.uniqueFile.Wordcount DATA_FILE BLOCK_SIZE

where:

  • DATA_FOLDER: Absolute path to the base folder of the dataset files
  • DATA_FILE: Absolute path to the dataset file
  • BLOCK_SIZE: Number of bytes of each block
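
To make the effect of BLOCK_SIZE concrete, the sketch below computes how many blocks a file of a given size would produce, assuming blocks are consecutive, non-overlapping, and that the last one may be shorter. The class BlockCountSketch, its methods, and the 2000-byte file size are hypothetical values chosen for illustration; the page does not state the sizes of the example files.

```java
public class BlockCountSketch {

    // Number of consecutive blockSize-byte blocks needed to cover fileSize bytes
    // (ceiling division).
    static long blockCount(long fileSize, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("BLOCK_SIZE must be positive");
        }
        return (fileSize + blockSize - 1) / blockSize;
    }

    // Length in bytes of the block at the given zero-based index.
    static long blockLength(long fileSize, long blockSize, long index) {
        long start = index * blockSize;
        return Math.min(blockSize, fileSize - start);
    }

    public static void main(String[] args) {
        long fileSize = 2000;  // hypothetical file size in bytes
        long blockSize = 650;  // BLOCK_SIZE value used in one of the examples
        System.out.println(blockCount(fileSize, blockSize));     // prints 4
        System.out.println(blockLength(fileSize, blockSize, 3)); // prints 50
    }
}
```

A smaller BLOCK_SIZE yields more, shorter blocks, and a larger one yields fewer, longer blocks.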

Execution Examples

runcompss --classpath=application_sources/jar/wordcount.jar wordcount.multipleFiles.Wordcount dataset/data-set/
runcompss --classpath=application_sources/jar/wordcount.jar wordcount.uniqueFile.Wordcount dataset/data-set/file_small.txt 650
runcompss --classpath=application_sources/jar/wordcount.jar wordcount.uniqueFile.Wordcount dataset/data-set/file_long.txt 250000

Build

Option 1: Native Java

cd application_sources/; javac src/main/java/wordcount/*.java
cd src/main/java/; jar cf wordcount.jar wordcount/
cd ../../../; mv src/main/java/wordcount.jar jar/

Option 2: Maven

cd application_sources/
mvn clean package


Version History

COMPSs 3.3 (earliest) Created 4th Dec 2023 at 14:22 by Raül Sirvent

Run with COMPSs version 3.3


Frozen COMPSs-3.3 3f5738b
Creators and Submitter
Additional credit

The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

Citation
Ejarque, J. (2023). Java COMPSs wordcount example (laptop run, files used as inputs). WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.684.1

Created: 4th Dec 2023 at 14:22

Last updated: 4th Dec 2023 at 14:25


Total size: 16.2 MB