Scalasca
(Scalasca 2.6.1, revision 5d44271d)
Scalable Performance Analysis of Large-Scale Applications
square [OPTIONS] (EXPERIMENT_DIR | CUBE_FILE)
square, the Scalasca analysis report explorer, facilitates post-processing, scoring, and interactive examination of analysis reports from both runtime summarization and tracing experiments.
When provided with a Score-P experiment directory EXPERIMENT_DIR, square post-processes the intermediate analysis reports produced by a measurement and/or an automatic trace analysis, deriving additional metrics and constructing a hierarchy of measured and derived metrics. It then presents this final report using the Cube GUI, unless the -s option is used. If the intermediate reports have already been processed, the final report is shown immediately. If a Score-P experiment directory contains more than one analysis report, the most comprehensive report is shown by default.
When provided with the name of a specific analysis report CUBE_FILE, post-processing is skipped and the corresponding report is shown immediately.
Analysis report examination can only be done after measurement and analysis are completed. Parallel resources are not required, and it is often more convenient to examine analysis reports on a different system, such as a desktop computer where interactivity is superior.
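A minimal sketch of this workflow, assuming an archive named scorep_foo_4_sum (an illustrative name) and guarded so the sketch also runs on systems without Scalasca installed:

```shell
#!/bin/sh
EXP=scorep_foo_4_sum    # illustrative Score-P experiment archive name

if command -v square >/dev/null 2>&1 && [ -d "$EXP" ]; then
    square -s "$EXP"    # post-process and print a textual score report
    square "$EXP"       # reports already processed: Cube GUI opens immediately
    mode=interactive
else
    mode=skipped        # Scalasca or the archive is not available here
fi
echo "workflow: $mode"
```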
Depending on the measurement configuration and the provided options, square places additional files into the experiment archive directory. For single-run experiments, the following files are created if the corresponding input files are available:
summary.cubex
: post-processed runtime summary result

trace.cubex
: post-processed trace analysis result

In scoring mode (-s option), square generates:

scorep.score
: detailed measurement score report, optionally suffixed with the name of a provided filter file (-f option)

In multi-run mode, aggregated reports are created if the corresponding input files are available:

profile_aggr.cubex
: aggregated runtime summary result

scout_aggr.cubex
: aggregated trace analysis result

scout+profile.cubex
: merged runtime summary and trace analysis result

summary_aggr.cubex
: post-processed aggregated runtime summary result

trace_aggr.cubex
: post-processed aggregated trace analysis result

trace+summary.cubex
: post-processed merged runtime summary and trace result

Level of sanity checks for newly created reports (default: 'none'). 'quick' performs various sanity checks on the experiment metadata, while 'full' additionally executes a more time-consuming check for negative metric values (which usually indicate a serious error).
Specifies the number of hardware and software counters to be taken into account when generating a score report (option -s). By default, this value is 0, meaning that only a timestamp is recorded for each event. If you plan to record additional counters, specify their number here; otherwise, scoring may underestimate the required space.
Force post-processing of analysis reports, even if a post-processed report already exists.
Apply the specified filter file when generating a score report (option -s).
Output a textual score report. Skips launching the Cube GUI.
Enable verbose mode.
Suppress the calculation of the 'Idle Threads' metric.
Set aggregation mode for runtime summarization results of each configuration. Currently supported modes are 'mean' and 'merge' (default).
Set aggregation mode for trace analysis results of each configuration. Currently supported modes are 'mean' and 'merge' (default).
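The scoring-related options above can be combined in scripts. The following is a minimal sketch, assuming an experiment archive named scorep_foo_4_sum and a Score-P filter file named filter (both illustrative names); the availability guard exists only so the sketch also runs on systems without Scalasca installed:

```shell
#!/bin/sh
EXP=scorep_foo_4_sum            # illustrative experiment archive name

if command -v square >/dev/null 2>&1 && [ -d "$EXP" ] && [ -f filter ]; then
    square -s -f filter "$EXP"  # textual score report with filter rules applied
    ls "$EXP"/scorep.score*     # the score report is written into the archive
    outcome=scored
else
    outcome=skipped             # keeps the sketch runnable without Scalasca
fi
echo "$outcome"
```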
For multi-run experiments, square provides additional options to aggregate the set of measurement results into a single Cube file. The user can choose between the two aggregation methods 'merge' and 'mean' to combine results from different configurations; under the hood, these use the corresponding CubeLib command-line tools. The default aggregation mode is 'merge'.

The 'merge' operation always copies metric data from the last measurement configuration in a given set in which data for a particular metric is available. This should be taken into account when setting up a multi-run experiment that is to be aggregated using the square command later on. In particular, it is recommended to place a low-overhead measurement without hardware performance counters at the end of a measurement configuration set that includes hardware counter measurements, in order to provide more accurate time information.

The aggregation of multi-run measurement results happens in the following order:
'mean', which is therefore hard-coded.

Depending on the measurement settings, these steps are applied if the respective intermediate results are found. Before merging intermediate results, square performs sanity checks that compare the call-tree structures to ensure that merging will produce a valid Cube file. In rare cases where the user is aware of potential call-tree differences, it may be necessary to skip these checks, which can be done by passing the -I option; note, however, that this may produce erroneous or at least misleading results. The reports of the individual runs are only post-processed when explicitly requested (-A option).
square exits with status 0 on success, and greater than 0 if errors occur.
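In batch scripts, the exit status can gate follow-up steps. A sketch, with an illustrative archive name and a fallback branch that exists only so the sketch runs without Scalasca installed:

```shell
#!/bin/sh
EXP=scorep_foo_4_sum                # illustrative archive name

if ! command -v square >/dev/null 2>&1; then
    status=unavailable              # Scalasca is not installed here
elif square -s "$EXP" >/dev/null 2>&1; then
    status=ok                       # exit status 0: success
else
    status=error                    # exit status greater than 0: errors occurred
fi
echo "square: $status"
```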
To examine an analysis report on a different system, for example, a desktop or laptop computer, it is often best to post-process the report using square's scoring functionality (-s option) on the system where the measurement has been taken, and then copy over the resulting post-processed Cube file. This is because square requires various command-line tools and support files from the Score-P, CubeLib, and Scalasca Trace Tools packages, which may not be available on the target computer.
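The copy-over workflow might look as follows; the host and archive names are purely hypothetical, and the commands are printed rather than executed, since they need access to the measurement system:

```shell
#!/bin/sh
REMOTE=user@hpc.example.org     # hypothetical measurement system
EXP=scorep_foo_4_trace          # hypothetical experiment archive

# Printed, not executed: drop the echo wrappers to run them for real.
echo "ssh $REMOTE 'square -s $EXP'"      # post-process on the measurement system
echo "scp $REMOTE:$EXP/trace.cubex ."    # fetch the post-processed report
echo "cube trace.cubex"                  # examine locally; needs only the Cube GUI
```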
square scorep_foo_4_trace
Post-process measurement reports in scorep_foo_4_trace and display the most comprehensive report using the Cube GUI.
square -s -f filter scorep_foo_4_sum
Post-process measurement reports in scorep_foo_4_sum and generate a score report with the run-time measurement filter rules from the file filter applied.
square -S mean scorep_foo_4_multi-run_c2_r4
Aggregate and post-process the measurement results of the multi-run experiment with two configurations and four runs per configuration stored in scorep_foo_4_multi-run_c2_r4. Then, show the most comprehensive report using the Cube GUI.
Copyright © 1998–2022 Forschungszentrum Jülich GmbH,
Jülich Supercomputing Centre
Copyright © 2009–2015 German Research School for Simulation Sciences GmbH, Laboratory for Parallel Programming