Discovery Environment User Guide

Instructions for deploying the Discovery Environment (DE) are provided in the Deployments section.

de_condor

Figure: DE executable jobs use HTCondor.

de_vice

Figure: DE interactive jobs use Kubernetes (K8s).

Infrastructure

The following sections describe the key components of the infrastructure upon which the DE operates.

Data Store

The DE provides access to, and management of, data via the CyVerse Data Store, which is built on top of iRODS.

Compute Platform(s)

The DE integrates the Data Store with HTCondor and the Agave Platform to provide a large set of tools for performing resource-intensive analyses.

PostgreSQL

Nearly all applications use a database. The DE is backed by a PostgreSQL database.

The schema can be found in the de-database repository.

RabbitMQ

RabbitMQ is used throughout our services, but primarily to integrate our services with iRODS.

Elasticsearch

The DE uses Elasticsearch to provide search and other capabilities.

Docker

Docker is used throughout the DE architecture. Most importantly, all of the tools that run in the DE's HTCondor cluster run within Docker containers, allowing us to integrate new tools without affecting existing tools.

Additionally, the components of the DE application are packaged as Docker containers.

All of these images can be accessed through our organization page on Docker Hub.

Architecture

The DE is composed of backend services and a user interface (UI).

Backend Services

The DE backend is built as a micro-service architecture. Each service is contained in the services/ folder.

The functionality of the micro-service architecture is aggregated in the Terrain service and exposed as a RESTful API.

More information about the backend micro-service architecture and implementation may be found here.
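The aggregation role described above can be pictured as a thin facade. The following is only a minimal sketch, not Terrain's actual implementation: the service names (apps, metadata, data-info) come from the service list in this guide, but the path prefixes, the `route_request` function, and the `is_valid_token` callback are hypothetical and chosen purely for illustration.

```python
# Hypothetical route table: path prefix -> backend service name.
# The service names appear in this guide; the prefixes are invented.
ROUTES = {
    "/apps": "apps",
    "/metadata": "metadata",
    "/data": "data-info",
}


def route_request(path, token, is_valid_token, routes=ROUTES):
    """Return (status, service) for a request, facade-style.

    401 if the token is rejected, 404 if no prefix matches,
    otherwise 200 with the name of the backend that would be called.
    """
    # Authentication is checked once, before any backend is chosen.
    if not is_valid_token(token):
        return 401, None
    # Match whole path segments so "/database" does not match "/data".
    for prefix, service in routes.items():
        if path == prefix or path.startswith(prefix + "/"):
            return 200, service
    return 404, None
```

In this sketch the caller supplies the token check, so the routing logic can be exercised without a real identity provider.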

UI

All of the UI services are provided by the DE API. The application itself is built with GXT, a UI component library built on top of GWT.

Discovery Environment

The GitHub repositories for all services are listed below:

analyses : Provides an HTTP API for interacting with analyses in the Discovery Environment.

app-exposer : This is a service that runs inside of a Kubernetes cluster namespace and implements CRUD operations for exposing interactive apps as a Service and Endpoint.

apply-labels : A small service in the Discovery Environment backend that periodically hits the app-exposer service to trigger the application of labels on VICE-related K8s resources.

apps : apps is a platform for hosting App Services for the Discovery Environment web application.

async-tasks : This service tracks and manages asynchronous tasks throughout the DE backend services.

bulk-typer : Like info-typer, but intended to process many items at once.

check-resource-access : Looks up the permissions that a subject has for a resource. By default, the subject type is 'user' and the resource type is 'analysis'. Only performs lookups against the permissions service.

clockwork : Scheduled jobs for the CyVerse Discovery Environment.

dashboard-aggregator : Gathers data to populate the dashboard in Sonora.

data-info : data-info is a RESTful frontend for getting information about and manipulating information in an iRODS data store.

data-usage-api : A service that provides an API around data usage tracking, and updates data usage numbers on request or periodically.

de-mailer : A Go module that sends email notifications to users. This module will support HTML and rich text emails.

de-nginx

de-stats : Service for obtaining CyVerse Discovery Environment stats and metrics.

de-webhooks : A service that listens to AMQP queues for DE notifications, checks whether the user has a webhook defined for that notification type, and posts the notification to the webhook if one is defined.

dewey : An AMQP message-based iRODS indexer for Elasticsearch.

email-requests : A simple service to wait for email requests to arrive over AMQP and forward them to cyverse-email.

event-recorder : This service listens to an AMQP topic for events, and records events that may be of interest to users in the notifications database.

get-analysis-id : TODO: FIND the repo.

grouper : - grouper-loader & grouper-ws TODO: FIND the repo. - From kubectl apply

info-typer : An AMQP message-based info type detector.

infosquito2 : TODO FIND description.

iplant-groups : A RESTful facade in front of Grouper.

jex-adapter : TODO FIND description.

job-status-recorder : TODO FIND description.

job-status-listener : Listens over HTTP for job status updates, then publishes them to AMQP.

kifshare : A simple web page that allows users to download publicly available files without a CyVerse account.

local-exim (exim-sender): TODO: FIND the repo. - From kubectl

metadata : The REST API for the Discovery Environment Metadata services.

monkey : This is a service that synchronizes the tag documents in the data search index with the metadata database.

notifications : This service provides the RESTful API for the revised notification system.

permissions : This service manages permissions for the Discovery Environment.

requests : Service for managing administrative requests in the CyVerse Discovery Environment.

terrain : Terrain provides the primary REST API used by the Discovery Environment. Its role is to validate user authentication and to coordinate calls to other web services. For more information, please see the Discovery Environment API Documentation.

resource-usage-api : A microservice developed as part of the CyVerse Discovery Environment that provides access to resource usage values (CPU hours, memory, etc.) consumed by users over a customizable time period.

saved-searches : A service for the CyVerse Discovery Environment that provides CRUD access to a user's saved searches.

search : A search facade for the DE and other clients. It uses the querydsl library under the hood to translate requests and provide documentation, then passes queries to the configured Elasticsearch servers.

sonora : UI for the Discovery Environment.

templeton : includes templeton-incremental & templeton-periodic TODO: add description.

timelord : timelord periodically queries the DE database for running applications and kills any of them that have gone over their time limit.

unleash : TODO: FIND the repo. - From kubectl apply

user-info : A service for getting user-related information like sessions and preferences.

vice-default-backend : Provides a default backend handler for the Kubernetes Ingress that handles routing for VICE apps. This backend decides whether to redirect requests to the loading page service, the landing page service, or to a 404 page depending on whether the URL is valid or not.

qms-adapter : Forwards usage information gathered within the Discovery Environment to the Quota Management System (QMS, https://github.com/cyverse/QMS).

qms : QMS is the CyVerse Quota Management System. Its purpose is to keep track of resource usage limits and totals for CyVerse users.

irods-csi-driver : The iRODS Container Storage Interface (CSI) Driver implements the CSI Specification to give container orchestration engines (such as Kubernetes) access to iRODS.

dataone-indexer : Event indexer for the DataONE member node service.
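As an illustration of one entry in the list above, the timelord behaviour (periodically finding running applications that have exceeded their time limit and killing them) could be sketched roughly as follows. This is an assumption-laden sketch, not timelord's code: the `RunningAnalysis` record, its field names, and the `kill` callback are hypothetical, and the real service reads its data from the DE database rather than from an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RunningAnalysis:
    """Hypothetical record; the real DE schema is not shown in this guide."""
    analysis_id: str
    started_at: datetime
    time_limit: timedelta


def find_expired(analyses, now):
    """Return the analyses whose elapsed run time exceeds their limit."""
    return [a for a in analyses if now - a.started_at > a.time_limit]


def sweep(analyses, now, kill):
    """Run one pass: call kill(analysis_id) for each expired analysis.

    Returns the list of ids that were passed to kill, in input order.
    """
    expired_ids = [a.analysis_id for a in find_expired(analyses, now)]
    for analysis_id in expired_ids:
        kill(analysis_id)
    return expired_ids
```

A scheduler would call `sweep` at a fixed interval with the current time. Passing `now` and `kill` in as arguments keeps the time-limit check easy to test in isolation.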

Non-Core Services

  1. redis-ha

  2. redis-haproxy

  3. elasticsearch (opensearch)

  4. keycloak

  5. openebs