{% extends "guide.html" %} {% block guide %}

Portal Technology Stack

Portal Overview

CEP, the Core Experience Portal, is the core engine for all TACC portal projects. The primary goal of the Core Portal project is to establish a codebase that can be used as a springboard for all future portal projects undertaken by TACC. By establishing a common codebase for all portal efforts, we can better maintain alignment between the core capabilities and technologies supported across all TACC portals. There will be unique requirements in some portal projects, but CEP should provide an "out-of-the-box" framework for rapidly deploying a new portal project with all the common capabilities already in place and compliant with current best practices and conventions at TACC.

Common Portal Capabilities:

CEP Major Components:

Note: Any additional portal capabilities required by a project need to be identified and planned for independently.

High-level Architecture

The portal architecture operates in a tiered structure. The tiers are listed below in order from the outermost tier inward:

Layer 4
The user-facing web portal that enables users to interact with Agave through a browser-based GUI.
Layer 3
The Agave API, which exposes access to Layer 2 and Layer 1 resources through a RESTful interface as well as a CLI for programmatic interaction with those resources.
Layer 2
The middleware service that enables data management, job creation, and job scheduling (e.g., Slurm, Kubernetes) across all Layer 1 resources.
Layer 1
The physical infrastructure (storage, HPC and Cloud systems) where data and applications are stored, manipulated, and executed (Corral, Stampede2, Lonestar5, etc.).

Portal high-level architecture diagram as a triangle pointing up and segmented into four layers, where the top is layer 4 and the bottom is layer 1
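As a concrete illustration of how the tiers interact, here is a minimal sketch of Layer 4 (the portal) querying Layer 3 (the Agave API) for the registered execution and storage systems that sit at Layers 1 and 2. The tenant base URL, bearer token, and response fields follow common Agave v2 conventions but are assumptions for illustration, not values defined in this guide.

```python
import requests

# Hypothetical tenant base URL and OAuth token; both are assumptions.
AGAVE_BASE = "https://api.tacc.utexas.edu"
TOKEN = "my-oauth-access-token"

headers = {"Authorization": f"Bearer {TOKEN}"}

# Layer 4 (the portal) asks Layer 3 (the Agave API) which Layer 1/2
# execution and storage systems are registered and available.
resp = requests.get(f"{AGAVE_BASE}/systems/v2/", headers=headers)
resp.raise_for_status()

for system in resp.json()["result"]:
    print(system["id"], system["type"], system["status"])
```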

Backend (Server-side)

Agave Platform
An open-source, science-as-a-service API platform (http://agaveapi.co) that provides HPC and file management integrations. Agave is developed in-house.

HPC Connectivity
From within the portal, access to HPC is primarily through Agave; Agave calls in turn submit jobs through Slurm. Multiple platforms are used depending on the particular application (or simply on availability); Lonestar5 and Stampede2 are the primary target platforms for app deployment. (A job-submission sketch appears after this list.)
File Storage
Files are stored on Corral, a mirrored GPFS storage facility, and are backed up to Ranch. From the web interface, all file I/O is done through Agave calls in order to maintain consistent metadata. Users also have the option to upload through Globus Online or through public cloud storage providers (Dropbox, Box, Google Drive); the web integration of these cloud providers routes imports through Agave to keep metadata consistent. (A file-listing sketch appears after this list.)
Applications
Portal applications have three distinct components. The applications themselves are installed on the execution systems: HPC platforms, or containers that run within the virtual infrastructure. The applications are then registered with the API: they are defined as Agave application records, pointing to a zip file on Corral, with associated metadata records for further use. A JSON document for each application defines the UI. Runtime inputs are supplied by the jobs the portal creates through Agave, and Agave supplies callbacks for updates on a given job's status. More details on application deployment are provided in the section entitled "Extensibility". Application templates are available at the template GitHub repository.
API
The API manager (APIM) sits in front of the Agave core and handles authentication, proxying/routing, client management, analytics, rate limiting, and many other features. It provides a unified namespace behind which the entire API is hosted.
Projects API
Returns project listings and associated files utilizing the Django framework.
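To make the HPC path concrete, below is a minimal sketch of submitting a job through Agave's Jobs service; Agave then handles the Slurm submission on the execution system and posts status updates back to the callback URL. The app id, input key, and portal callback URL are hypothetical placeholders, not values from an actual portal deployment.

```python
import requests

AGAVE_BASE = "https://api.tacc.utexas.edu"   # assumed tenant base URL
headers = {"Authorization": "Bearer my-oauth-access-token"}

# Hypothetical job request: the appId, input key, and callback URL are
# placeholders; real values come from the registered app definition.
job_request = {
    "name": "demo-analysis-job",
    "appId": "demo-app-1.0",
    "archive": True,
    "inputs": {"inputFile": "agave://corral-storage/username/input.dat"},
    "parameters": {},
    "notifications": [
        # Agave posts job status updates (PENDING, RUNNING, FINISHED, ...)
        # back to the portal at this callback URL.
        {"url": "https://portal.example.org/webhooks/jobs/${JOB_ID}",
         "event": "*",
         "persistent": True},
    ],
}

resp = requests.post(f"{AGAVE_BASE}/jobs/v2/", headers=headers, json=job_request)
resp.raise_for_status()
print("Submitted job:", resp.json()["result"]["id"])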
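Similarly, a file listing from the web interface goes through Agave's Files service rather than touching the storage system directly, which is how metadata stays consistent. The sketch below assumes a hypothetical storage system id and path.

```python
import requests

AGAVE_BASE = "https://api.tacc.utexas.edu"   # assumed tenant base URL
headers = {"Authorization": "Bearer my-oauth-access-token"}

# Hypothetical storage system id and path; real portals resolve these
# from the user's account and the storage system registered on Corral.
system_id = "corral-storage"
path = "username/projects"

resp = requests.get(
    f"{AGAVE_BASE}/files/v2/listings/system/{system_id}/{path}",
    headers=headers,
)
resp.raise_for_status()

for entry in resp.json()["result"]:
    print(entry["name"], entry["type"], entry["length"])
```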

Frontend (Client-side)

Web-based Portals

The portals use Agave to manage apps, data storage, and reconfigurable workflows, and to interact with HPC resources. The web portal architecture provides data management, analysis, and simulation capabilities to users through a web interface. The dashboard provides an overview of job status, data storage, allocations, and system status. Users will primarily interact with the portal through the Workbench, which includes: Manage Account, Data Files, Application Workspace, Search, and Help.

My Dashboard

The dashboard displays the availability of TACC resources and the user's allocation usage. The CEP infrastructure runs on TACC-hosted virtual machines as a Django/Angular web portal with a responsive layout; every portal project has dedicated VM resources, and system status instrumentation is provided via APIs. The AngularJS framework works by first reading the HTML page, which has additional custom tag attributes embedded in it. Angular interprets those attributes as directives that bind input or output parts of the page to a model represented by standard JavaScript variables. The values of those JavaScript variables can be set manually in code or retrieved from static or dynamic JSON resources.
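For illustration only, the sketch below shows one way the portal backend could expose system status as a JSON resource for the dashboard's Angular model to bind to; the upstream status endpoint, URL, and field names are assumptions, not part of the CEP codebase.

```python
# A minimal, hypothetical Django view: it exposes system status as a JSON
# resource that an AngularJS controller could fetch and bind to the dashboard.
import requests
from django.http import JsonResponse

def system_status(request):
    # Assumed upstream instrumentation API; not a documented TACC endpoint.
    upstream = requests.get("https://status.example.tacc.utexas.edu/api/systems")
    upstream.raise_for_status()

    systems = [
        {"name": s["name"], "status": s["status"], "load": s.get("load")}
        for s in upstream.json()
    ]
    return JsonResponse({"systems": systems})
```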

Data Files

Data Files is a collection of storage spaces where user and project data are located, stored, and ultimately organized by users to curate publications and share information. The data is organized into three areas: the Data Repository, where experimental and simulation results are stored for the long term; Working storage, an area to share and collaborate on data that is not yet published; and the Workspace, which allows for the analysis of data and serves as a gateway to large-scale HPC resources and simulation tools.

Applications

Apps are executable code available for invocation through Agave's Jobs service on a specific execution system. If a single code needs to run on multiple systems, each combination of code and system must be defined as its own app. Code that can be forked at the command line or submitted to a batch scheduler can be registered as an Agave app and run through the Jobs service. The user sends a request by filling in the application inputs and job details on the CEP portal. Each app is packaged with three files (a minimal app-definition sketch follows this list):

App.json
A JSON file with the application definition and parameters such as name, version, and label.
Wrapper.sh
A wrapper shell script that executes the job on the HPC system or in a Docker container.
Test.sh
A shell script that runs a test before the job is launched and cleans up the files.
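The sketch below shows a pared-down app definition in the spirit of app.json being registered with Agave's Apps service. The system ids, deployment path, and input id are placeholders rather than values from an actual portal app.

```python
import requests

AGAVE_BASE = "https://api.tacc.utexas.edu"   # assumed tenant base URL
headers = {"Authorization": "Bearer my-oauth-access-token"}

# Pared-down app definition; system ids, paths, and the input id are
# placeholders for illustration.
app_definition = {
    "name": "demo-app",
    "version": "1.0",
    "label": "Demo App",
    "executionSystem": "stampede2-exec",
    "executionType": "HPC",
    "parallelism": "SERIAL",
    "deploymentSystem": "corral-storage",
    "deploymentPath": "/apps/demo-app-1.0.zip",
    "templatePath": "wrapper.sh",   # the wrapper script described above
    "testPath": "test.sh",          # the test script described above
    "inputs": [{"id": "inputFile", "details": {"label": "Input file"}}],
    "parameters": [],
}

# Register (or update) the app with Agave's Apps service.
resp = requests.post(f"{AGAVE_BASE}/apps/v2/", headers=headers, json=app_definition)
resp.raise_for_status()
print(resp.json()["result"]["id"])   # e.g. "demo-app-1.0"
```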

Notifications

The notification bell enables the user to view information about, and the status of, submitted jobs.

Search

CEP includes a multi-tenant-capable full-text search engine, based on Elasticsearch, with an HTTP interface and schema-free JSON documents. Users can search for data files or text.
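For example, a query against the search index might look like the following sketch, which assumes a local Elasticsearch instance and a hypothetical index name and document layout; the portal's actual indices may differ.

```python
from elasticsearch import Elasticsearch

# Hypothetical Elasticsearch host and index name; assumptions for illustration.
es = Elasticsearch("http://localhost:9200")

# Full-text query across indexed file metadata documents.
results = es.search(
    index="portal-files",
    body={"query": {"query_string": {"query": "wind simulation"}}},
)

for hit in results["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("name"))
```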

Environment

The Core Portal utilizes a wide variety of technologies developed by multiple vendors. Below is a list of the primary libraries, frameworks, and APIs leveraged in the Core Portal tech stack.

Nginx
When a user sends an HTTP request to the CEP URL, it first reaches the Nginx web server, a proxy that acts as an intermediary for requests from clients seeking resources from other servers. Nginx carries the request to the Web Server Gateway Interface.
WSGI
The Web Server Gateway Interface, a calling convention for a web server to forward requests to web applications or frameworks written in the Python programming language. It forwards the CEP user's request to Django.
Django
A Python-based backend framework that manages the request and response cycle. It passes data for the CEP user's request to Angular and returns the resulting information to the client.
Angular
The JavaScript framework that handles the frontend, client-side activity taking place on the page and issues request methods such as GET, POST, and PUT.
PostgreSQL
The object-relational database management system used by the portal.
Agave Platform
An open-source, science-as-a-service API platform used to securely manage, move, and share data and metadata, and to access shared high-performance computing (HPC), cloud, and big-data resources under a single, web-friendly REST API. It runs code on registered execution systems and simplifies building web portals that use back-end computing.
RabbitMQ
The portal's message broker: an enterprise messaging system modeled on the Advanced Message Queuing Protocol (AMQP) standard. A message broker acts as an intermediary platform that handles communication between two applications.
Celery
A distributed task queue based on message passing.
HPC
High-performance computing: TACC supercomputer resources (Frontera, Stampede2, etc.).
CMS
The Content Management System used in the core portal.
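As one example of how the messaging pieces fit together, the sketch below wires a Celery application to a RabbitMQ broker and defines a placeholder task. The broker URL, credentials, and task name are illustrative assumptions, not the portal's actual configuration.

```python
from celery import Celery

# Assumed RabbitMQ broker URL and credentials; for illustration only.
app = Celery(
    "portal",
    broker="amqp://portal:password@localhost:5672//",
    backend="rpc://",
)

@app.task
def index_file(system_id: str, path: str) -> str:
    """Placeholder task: push a file's metadata into the search index."""
    # A real implementation would call Elasticsearch here.
    return f"indexed {system_id}/{path}"

# A Django view or signal handler would enqueue work asynchronously like this:
# index_file.delay("corral-storage", "username/projects/data.csv")
```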
{% endblock guide %}