Using Data Analytics & Visualization

As with BG/P, the machine is split into two parts. Eureka is on the Intrepid filesystem, and Gadzooks is on the Surveyor filesystem, so you can process your files without moving them.

More information on Eureka and Gadzooks can be found in their system guide.

Data Analysis and Visualization Resources

  • VisIt
  • ParaView (Client/Server)
  • ParaView (Standalone)

Visualization-Related Presentations

ParaView Red Blood Cell Tutorial

Goals

This tutorial is intended to be a hands-on resource for users interested in learning the basic concepts of ParaView. The examples can easily be run on a laptop, using the example data set provided.

  • Tour of ParaView
  • Show range of visualization methods
    • Walk through various visualization techniques, hopefully illustrate how these can apply to your own data.
  • Feel for ParaView "way"
    • Terminology and step-by-step process peculiar to ParaView, which may differ from other packages, e.g. VisIt.

cutting planes: close-up

RBCs: close-up

RBC: Continuum

Data

The data used for this tutorial is:

  • Blood flow simulation data
  • Multiple data types
    • Continuum data field (unstructured mesh, tetrahedral): fluid field, plasma
    • Particle data (unstructured points): individual particles moving in the flow
    • Red Blood Cells (RBC, unstructured mesh, triangle): mesh of the surface of an RBC
      • Healthy
      • Diseased
  • Generated using an integrated Nektar/LAMMPS simulation code
  • Courtesy of George Karniadakis and Leopold Grinberg of Brown University

The data is available for download here (~27MB compressed, ~39MB uncompressed):

Data set for ParaView Red Blood Cell Tutorial

File Open Icon

Load Multi-component Dataset

 

  • From the File menu (or by clicking the file folder icon, shown above), open each of the following data sets: select the file, then click "OK".
  • The file will then appear in the Pipeline Browser.
  • Click Apply in the Object Inspector.
  • You will need to do this one data set at a time:
    • continuum...vtu
    • particles...vtu
    • rbc_...vtu
    • bad_rbc...vtu
  • The "..." in the name, and the arrow in the file browser, indicate that there are multiple time steps for each of these files.

PV GUI: File Open

PV GUI: Pipeline Apply

 

With all of the default settings, you should see something like this:

 

RBC: Default View

Select which data to view

Let's start by looking at the continuum.000* data. This is an unstructured mesh that has velocity and count (density) values.

 

  • Hide the other data sets using the Eyeball icon next to their names in the Pipeline Browser.
    • Black = visible, Grey = hidden
  • Select continuum.000* in the Pipeline Browser.
    • Click on the name to highlight it
  • Appearance changes and filters always apply to the selected data set.
  • Switch to the Display tab in the Object Inspector.
  • Under Color by, select Velocity from the dropdown.
    • There is also a shortcut to Color by in the menu bar near the top of the GUI.
PV GUI: View 4 RBC: Continuum

Manipulating the color map

To change the colors used to represent the Velocity:

 

  • Under Color by click the Edit Color Map... button
  • On the Color Scale Editor window click the Choose Preset button
  • On the Preset Color Scales window, select: Blue to Red Rainbow, and click OK. Then click Close on the Color Scale Editor window
  • You can also create and save your own color maps

PV GUI: Edit Color Map PV GUI: Color Scale Editor

RBC Continuum Color2

 

Data representation

In order to see the particles and red blood cells inside the cylinder, we need to be able to see through it. Scroll down a bit in the Object Inspector:

  • Find the group of controls labeled Style
  • In the Representation dropdown, select Wireframe
PV GUI : Object Inspector - Wireframe

RBC Continuum: Wireframe

 

PV Streamline Icon

Generate Streamlines

 

  • ParaView enables the generation of different types of data from existing data sets in the Pipeline
  • Streamlines: Generated from vectors of the flow field. These curves show the direction a fluid element will travel in at any point in time
  • Make sure that the continuum.000* data is selected in the Pipeline Browser
  • From the main menu select: Filters->Alphabetical->Stream Tracer, or click on the Stream Tracer icon from the menu bar
  • In the Object Inspector make sure the Properties tab is selected.
  • Scroll down to Seeds, and change Seed Type to Line Source
  • Click the Y Axis button to set the seed line to run along the Y axis.
  • The default Resolution is set to 100. This will make things a bit cluttered, especially when we start adding in the other data, so let's reduce this to 25
  • Click the Apply button.
PV: Streamline GUI

RBC Continuum: Streamlines

 

 

Streamlines as Tubes

The streamlines are just that, lines. We can use the Tubes filter to represent them as 3D objects, rather than just lines.

 

 

  • With StreamTracer1 selected in the Pipeline Browser, from the main menu select: Filters->Alphabetical->Tube
  • In the Object Inspector make sure the Properties tab is selected.
  • The default value for the Radius is a bit too large for this data, let's set that value to 0.1
  • Click the Apply button.
  • Notice that the StreamTracer1 object has automatically been hidden.
  • There are many different ways to color these tubes.
  • With Tubes1 selected, switch to the Display tab in the Object Inspector.
  • The Color by dropdown lets you choose from a handful of different variables.
PV: Tube GUI

RBC Continuum: Tubes

PV Cut Icon

Cutting Planes (Slices)

Now let's add some cutting planes, or slices, to see what the cross-section of the continuum data looks like.

 

 

  • Again, be sure that the continuum.000* data is selected in the Pipeline Browser.
  • From the main menu select: Filters->Alphabetical->Slice, or click on the Slice icon from the menu bar
  • In the Object Inspector make sure the Properties tab is selected.
  • At the bottom of the Object Inspector is a section titled Slice Offset Values. Here we can generate offset values for multiple slices.
  • First click the Delete All button to remove the initial values
  • Next, click the New Range button. This will bring up an Add Range dialog box.
  • Set the number of Steps to 7. Click OK
  • Click the Apply button.
  • With Slice1 selected in the Pipeline Browser, switch to the Display tab in the Object Inspector
  • Set Color by value to Velocity.
PV Slice GUI PV Slice Range GUI

Red Blood Cell : Cutting Planes 1

Data representation: Opacity

Even with the continuum data represented as wireframe, there is still considerable occlusion of the interior structures. To further reduce this occlusion, we can make the wireframe more transparent.

  • Again, be sure that the continuum.000* data is selected in the Pipeline Browser.
  • In the Object Inspector make sure the Display tab is selected.
  • In the Object Inspector there is a section titled Style.
  • Set Opacity to 0.2
PV GUI Opacity

RBC cutting planes 2

Animating Simulation Data

Since our data has multiple time steps, we can easily animate through them to see how the data changes over time.

  • Click the Play button on the animation bar at the top of the GUI
  • Click Pause to stop the animation
  • Loop: with this button toggled on, the animation will repeat until stopped
PV Animation Controls

Animations

Animations can be saved to disk as a movie file, to be played back later.

 

  • From the main menu: File->Save Animation
  • In the Animation Settings dialog, click Save Animation
  • For Files of type, select AVI files (*.avi)
  • Enter a name in File name
  • Click OK
  • The movie can be played back with standard media players (Windows Media Player, QuickTime, VLC, etc.)
PV Animation Dialog PV Save Animation

PV Glyph Icon

Particles as Glyphs

Glyphs are another way of visually representing data where the attributes of a graphical element are dictated by attributes of the data.

Now let's add some of our other data back into the scene. Let's start with the particle data.

All of the particles are displayed as red points in the graphics window. There are ~39K particles in this particular data set, which makes the display rather cluttered. In order to both filter some of these out, and create 3D representations for them, we will apply the glyph filter to this data.

Notice that the particles.000* is still visible.

 

  • Unhide the particles.000* data: click the Eye icon
  • Select the particles.000* data: click on the name
  • From the main menu select: Filters->Alphabetical->Glyph, or click on the Glyph icon from the menu bar
  • Glyph Type: Sphere
  • Radius: 0.15
  • Orient: Unchecked
  • Scale Mode: off
  • Set Scale Factor: 1 (Edit: Checked)
  • Maximum Number of Points: 3000
  • Mask Points: Checked
  • Random Mode: Unchecked
  • Click the Apply button.
  • Since our goal was to unclutter the display, hide the particles.000* data by clicking the Eye icon next to it in the Pipeline Browser
  • With Glyph1 selected, switch to the Display tab in the Object Inspector and change the Color by value to GlyphVector. Since the GlyphVector value is based on the velocity, we can click Edit Color Map... and choose the same Blue to Red Rainbow preset that we previously chose for velocity
RBC Particles
 

PV Glyph GUI

RBC Glyphs (Red)

RBC Glyphs (Multi)

Enter: Red Blood Cells

Now let's add in both of the other data sets, which are polygonal meshes that make up Red Blood Cells (RBCs).

These two data sets are essentially the same kind of data, so we can apply the same filters and make the same types of representation changes to each of them. However, some of the RBCs are marked by the simulation that generated them as healthy (rbc.000*) and some of them are marked as diseased (bad_rbc.000*).

  • Unhide the rbc.000* and bad_rbc.000* data sets by clicking the Eye icon next to each of them
RBCs: Healthy

RBCs: Healthy + Diseased

Using color to differentiate data

To distinguish these two types of data from one another, we can vary their representations.

One way to do this is by setting the color of the two data sets to different colors. Repeat this process for each of rbc.000* and bad_rbc.000*, picking different colors.

 

  • Select one of the rbc data sets in the Pipeline Browser
  • Go to the Display tab in the Object Inspector
  • In the Color by: dropdown select Solid Color
  • Click on the Set Solid Color... button
  • Select a color from the Select Color dialog that appears.
  • Repeat for the other RBC data set, choosing a different color.
PV Color By GUI PV Color Selector

RBCs: Colored

Further Exploration: Highlight the Mesh

Change the representation of one of the RBC data sets.

In this example, the continuum.000* data is also hidden to reduce confusion with showing multiple overlapping meshes.

  • Select one of the RBC data sets
  • Go to the Display tab in the Object Inspector
  • For the Representation select Surface With Edges
  • In the Edge Style section click on the Set Edge Color... button to select a different color from the Select Color dialog.
PV: Display Surface with Edges

RBC: Highlight Mesh

Further Exploration: Highlight the Vertices

Add glyphs to illustrate the position of the vertices of one of the RBC data sets.

  • Select one of the RBC data sets
  • Select the Glyph filter
    • Since this filter was used recently, it can also be found under: Filters->Recent->Glyph
  • As in the earlier example, set the various configuration options for the glyph attributes. Note that this time, we want to show all of the vertices of the RBC, so we should uncheck the Mask Points option.
RBC: Highlight Vertices

Further Exploration: Color by Variable

Try playing around with the viewing options and representations of the other data objects.

Change the:

  • Color by values
  • Opacity
  • Representation
  • etc.

RBC: Color by Variable

 

 

 

Background color

 

  • Background color is an important part of the final visualization
  • From the main menu choose: Edit->View Settings...
  • Under General in the View Settings dialog box, select Choose Color
  • Pick a color in the Select Color dialog, then click OK
  • Click Apply, then OK.
PV GUI Set Background Color
 

RBC: Black background

This tutorial was developed with support from National Science Foundation Grant OCI-0904190, and from the Argonne Leadership Computing Facility at Argonne National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357.

 

ParaView on the Data Analytics Cluster

Set Up User Environment

The recommended way of running ParaView on Eureka is in client/server mode. This consists of running the ParaView client on your local resource, and the ParaView server (pvserver) on the Eureka visualization nodes. The latest version currently installed on Eureka is ParaView 3.14. The same version should first be installed on your local resource. Binary and source packages for Linux, MacOS, and Windows are available from the ParaView Download Page.

To put ParaView in your environment on Eureka, add the line shown below to your ~/.softenvrc file. It must go before the @default entry. This macro conflicts with the @visit macro used for enabling VisIt, so only one of the two should be active at a time; remove the other from your .softenvrc file, or comment it out by placing a '#' character at the beginning of its line.

 @paraview-3.14.1

Then run the command "resoft".
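
The ~/.softenvrc edit can also be scripted. The following is a minimal sketch, not an ALCF-provided tool: the helper name add_softenv_key is hypothetical, and it assumes the file contains a line that reads exactly @default. It inserts the key on the line before @default, appends it if no such line exists, and does nothing if the key is already present.

```shell
# Hypothetical helper (not part of SoftEnv): insert a key before the @default line.
# Usage: add_softenv_key KEY FILE
add_softenv_key() {
  key=$1
  file=$2
  # Do nothing if the key is already present as a whole line.
  if grep -qxF -- "$key" "$file"; then
    return 0
  fi
  # Print the key just before the first @default line; if there is no
  # @default line, append the key at the end of the file instead.
  awk -v key="$key" '
    $0 == "@default" && !inserted { print key; inserted = 1 }
    { print }
    END { if (!inserted) print key }
  ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Example (run "resoft" afterwards, as described above):
#   add_softenv_key @paraview-3.14.1 ~/.softenvrc
```

Keeping a backup copy of ~/.softenvrc before running a script like this is a sensible precaution.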

In order for ParaView to take advantage of the accelerated graphics on Eureka, the DISPLAY environment variable needs to be set to the local X server. This can be done by adding a few lines to the configuration file that sets up your default shell on Intrepid/Eureka.

Bash users should add the following to your ~/.bashrc file on Intrepid/Eureka:
(Note: Be careful to ensure that the characters around the word hostname are backticks, and not single quotes, especially if cutting and pasting the text.)

 if ( echo `hostname` | grep -sq 'vs' ); then
   export DISPLAY=:0.0
 fi

Csh/Tcsh users should add the following to your ~/.cshrc file on Intrepid/Eureka:
(Note: Be careful to ensure that the characters around the word hostname are backticks, and not single quotes, especially if cutting and pasting the text.)

 if(  `hostname` =~ '*vs*' ) then
   setenv DISPLAY :0.0
 endif

This will tell any process that runs on a visualization node to use display :0.0 (without changing any settings you may have set on the login node). This means that there can only be one pvserver process per node, even though each node has 2 graphics cards. 
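
For Bash and other POSIX shells, an equivalent check can be written without backticks, which avoids the quoting pitfall noted above. This is a sketch only: the function name is_vis_node is hypothetical, and it assumes, as the snippets above do, that visualization node hostnames contain the string vs.

```shell
# Hypothetical helper: succeed if the given hostname contains "vs".
is_vis_node() {
  case "$1" in
    *vs*) return 0 ;;
    *)    return 1 ;;
  esac
}

# Only set DISPLAY on visualization nodes; login nodes are left untouched.
if is_vis_node "$(hostname)"; then
  export DISPLAY=:0.0
fi
```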

Start the ParaView Server

In order to connect the ParaView client to a running pvserver, you need to know the host (and port) where the server is listening. This will be the head node of the job. There are several ways to find it, but probably the easiest is to submit an interactive job, starting from a shell on a Eureka login node:

  login1.eureka:~> qsubi -n 4 -t 60 -A project_id

The -A project_id is required only if you have multiple projects. When the job starts, you will automatically be logged into the head node of your job. Be sure to take note of the hostname of the host that you end up on, as you will need it in order to connect your ParaView client to the pvserver.

We manually run the mpiexec command to start pvserver:

  vs37:~> mpiexec -machinefile $COBALT_NODEFILE -np 4 pvserver
  Waiting for client..
  Connection URL: CS://vs37:11111
  Accepting connection(s): vs37:11111

Because of the current display settings, you should run 1 process on each node. So the -n value passed to qsubi and the -np value passed to mpiexec above should be the same. Otherwise, multiple processes on the same host will step on each other while trying to access the same graphics card.
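
To keep the two values in step, the process count can be derived from the node file rather than typed by hand. This is a minimal sketch, assuming $COBALT_NODEFILE lists one hostname per allocated node, one per line; the helper name count_nodes is hypothetical. The mpiexec line only runs inside a job, where $COBALT_NODEFILE is set.

```shell
# Hypothetical helper: count the distinct, non-empty hostnames in a node file.
count_nodes() {
  sort -u "$1" | grep -c .
}

# Inside a job, start one pvserver process per node.
if [ -n "${COBALT_NODEFILE:-}" ] && [ -r "$COBALT_NODEFILE" ]; then
  np=$(count_nodes "$COBALT_NODEFILE")
  mpiexec -machinefile "$COBALT_NODEFILE" -np "$np" pvserver
fi
```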

Once the pvserver is running, and is "Accepting connection(s)" as shown above, you can connect to it from the ParaView client on your local resource. To do this we will need to set up a server configuration in your local ParaView client. (We will only need to do this once, details below.) We will also need to set up an ssh tunnel from your local resource through the Eureka login node, to the visualization node where your pvserver is listening. This will be the host where you ran mpiexec above. (This will need to be done each time you submit a new qsubi, since you will likely end up on a different host each time.)

Set Up SSH Tunnel

Linux/Unix/MacOS:

From a shell on your local resource, run the following command, substituting vs37 with the node where your pvserver is listening (where you ran mpiexec above). If your local username is different from your login on Eureka, you should include your Eureka login in the command line below (Note: You will be prompted for your CRYPTOcard one-time password. After you authenticate, the ssh tunnel will be established, but you won't receive any output or be returned to a command prompt. This is normal. When you are done with your session, use Ctrl-C to close the tunnel.):

 ssh -NL 11111:vs37:11111 username@eureka.alcf.anl.gov
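
Since the node name changes with each job, it can help to wrap the tunnel command in a small shell function. This is a sketch only: the function name pv_tunnel_cmd is hypothetical, and the port defaults to 11111 as in the example above. It prints the command rather than running it, so it can be checked before use.

```shell
# Hypothetical helper: print the ssh tunnel command for a given node and login.
# Usage: pv_tunnel_cmd NODE EUREKA_LOGIN [PORT]
pv_tunnel_cmd() {
  node=$1
  login=$2
  port=${3:-11111}
  printf 'ssh -NL %s:%s:%s %s@eureka.alcf.anl.gov\n' "$port" "$node" "$port" "$login"
}

# Print the command for node vs37, then paste it into a shell to run it.
pv_tunnel_cmd vs37 username
```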

Windows (using PuTTY):

Download PuTTY from: http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

Right-click on putty.exe, save link as: C:\putty.exe

Create the SSH tunnel by first starting Command Prompt:
Start->Programs->Accessories->Command Prompt

Then start PuTTY using the following command, substituting vs37 with the node where your pvserver is listening (where you ran mpiexec above). If your local username is different from your login on Eureka, you should include your Eureka login in the command line below (Note: You will be prompted for your CRYPTOcard one-time password. After you authenticate, the ssh tunnel will be established, but you won't receive any output or be returned to a command prompt. This is normal.):

 c:\putty -ssh -N -L 11111:vs37:11111 username@eureka.alcf.anl.gov

Start ParaView Client

You should now launch the ParaView client on your local resource. In order to connect to our running pvserver, we will need to configure some server settings in the client. This should only need to be done once, and can be reused each time you run ParaView on Eureka.

Server Configuration

Select Connect

From the ParaView client, choose to connect to a server by either clicking the "Connect" icon in the menu bar, or selecting File->Connect from the main menu.

Add a new server (first time only)

Once we set up this server, we can reuse it each time we connect the ParaView client to the pvserver on Eureka, provided we always use the same port number on the local resource, and we establish the ssh tunnel as described above.

Click "Add Server"

Configure Server, Part 1 (first time only)

Configure the server by first giving it a Name, such as eureka

Select Server Type: Client/Server

The Host and Port values can be left as the defaults, localhost and 11111, respectively.

Click "Configure"

Configure Server, Part 2 (first time only)

Because we are going to connect to a ParaView server that we have already started, we don't need the ParaView client to start a server for us.

Select Startup Type: Manual

Click "Save"

Connect

Now that we have a server defined and configured, highlight it in the list.

Click "Connect"

 

Now when you select File->Open from the main menu, you will be browsing the filesystem on Eureka. 

Using VisIt

Getting Started

On your local machine:

  • Download (https://wci.llnl.gov/codes/visit/download.html) and install VisIt. The latest version installed on Eureka is 2.5.2. Note: If you opt to install the ANL host profiles during installation, you may need to replace <path_to_install>/2.5.2/.visit/hosts/host_anl_eureka.xml with this VisIt host profile for Eureka.
  • Download the VisIt host profile for Eureka (you may need to right-click and choose "Save link as..." or "Save target as...").
  • Copy this file to ~/.visit/hosts/host_anl_eureka.xml (Note: the file extension should be .xml, not .txt).
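
On Linux or MacOS, the copy step can be done from a shell. This is a minimal sketch: the helper name install_host_profile and the download path in the example are hypothetical. It creates the hosts directory if needed and saves the profile under the expected .xml name, whatever extension the browser gave the downloaded file.

```shell
# Hypothetical helper: copy a downloaded host profile into the VisIt hosts directory.
# Usage: install_host_profile DOWNLOADED_FILE [HOSTS_DIR]
install_host_profile() {
  src=$1
  hosts_dir=${2:-$HOME/.visit/hosts}
  # Create the directory if it does not exist, then copy under the .xml name.
  mkdir -p "$hosts_dir" &&
    cp "$src" "$hosts_dir/host_anl_eureka.xml"
}

# Example, assuming the browser saved the file with a .txt extension:
#   install_host_profile ~/Downloads/host_anl_eureka.xml.txt
```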

On the Eureka login host:

  • Edit your .softenvrc file to include the "@visit" key before the "@default" line (Note: If you have an @paraview key, or an mpi key other than +mpich-mx-1.2.7..7 in your .softenvrc file, you should comment these out using '#', as these will conflict with the version of mpi that VisIt uses.)
  • If desired, set a default project by setting the $COBALT_PROJ environment variable

Running VisIt

  • Start up VisIt on your local machine
  • Click File -> Open File and choose "login1.eureka.alcf.anl.gov" from the "Host" dropdown
  • You'll be prompted for your password; enter your CRYPTOcard response (with PIN, if you've been assigned one)
  • When you open a selected file, it will launch a job on Eureka
    • By default, the "Bank" field will use your $COBALT_PROJ, but you can also specify a project in the Options box
    • If your environment doesn't get sourced correctly with non-interactive SSH, you can set the default project to use under Options -> Host profiles (see below)
    • Don't change the contents of the "Machine file" field (it should be $COBALT_NODEFILE)
    • If you'd like to change other job parameters (like the number of processes, nodes, and walltime), you can do so here
  • You can change parallel engine defaults in Options -> Host profiles
    • Choose the "login1" tab at the top of the window
    • Cobalt/mpirun options can be changed in the Parallel options tab
    • If you make any changes here, make sure to save your changes
  • Important: Using the above host profile the serial Launch Profile will be set as the default.  Do not change this setting.

Additional Info

VisIt user manual: https://wci.llnl.gov/codes/visit/manuals.html

VisIt wiki: http://www.visitusers.org/index.php?title=Main_Page