Frequently Asked Questions (FAQ)¶
Common questions about deploying, administering, and developing with CyVerse infrastructure.
Deployment & Installation¶
How do I deploy CyVerse on my infrastructure?¶
Start with the Getting Started Guide to review prerequisites, then follow the Deployment Guide for step-by-step deployment instructions. The deployment follows this sequence:
- Deploy Kubernetes cluster
- Set up storage (OpenEBS) and networking (Ingress NGINX)
- Deploy core services (KeyCloak, RabbitMQ, Redis, ElasticSearch)
- Provision databases
- Deploy application services (Discovery Environment, User Portal, VICE)
See the deployment roadmap for detailed phase breakdown.
What are the minimum hardware requirements?¶
Minimum recommended configuration:
- 8-node Kubernetes cluster
- 32 CPU cores per node (256 cores total)
- 128GB RAM per node (1TB total)
- 10TB+ persistent storage (NFS, iRODS, or cloud block storage)
- 10 Gbps network connectivity
- Public IP addresses for ingress services
For smaller deployments (testing/development): - 3-node cluster - 16 cores per node - 64GB RAM per node - 1TB storage
See Prerequisites for complete requirements.
Can I deploy CyVerse on commercial cloud providers (AWS, GCP, Azure)?¶
Yes! CyVerse can be deployed on:
- :simple-amazonaws: AWS - Use EKS for Kubernetes, EBS for storage
- Google Cloud - Use GKE for Kubernetes, Persistent Disks for storage
- :simple-azuredevops: Azure - Use AKS for Kubernetes, Azure Disks for storage
- OpenStack - CyVerse's primary cloud platform
You'll need to adapt storage provisioning and load balancer configurations for your cloud provider. See the Kubernetes deployment guide.
What Kubernetes distribution should I use?¶
CyVerse is tested with:
- Upstream Kubernetes (1.23+) - Recommended for most deployments
- OpenShift - Enterprise Kubernetes with additional security features
- Rancher - Kubernetes management platform
- Cloud-managed Kubernetes - EKS, GKE, AKS
Use the latest stable Kubernetes version supported by your distribution.
How long does a full CyVerse deployment take?¶
Typical timeline (experienced DevOps team):
- Weeks 1-2: Infrastructure setup (Kubernetes, storage, networking)
- Week 3: Core services deployment (KeyCloak, RabbitMQ, Redis, ElasticSearch)
- Week 4: Database provisioning and configuration
- Weeks 5-6: Application deployment (DE, User Portal, VICE)
- Week 7: Testing, tuning, and documentation
Total: 7-8 weeks for initial production deployment.
Development/testing environments can be deployed faster (2-3 weeks).
Do I need to deploy all CyVerse services?¶
No! CyVerse is modular. You can deploy:
- Core only: Discovery Environment for computational workflows
- Data Store only: iRODS-based data management
- Custom subset: Select services for your use case
All services require Kubernetes, PostgreSQL databases, and KeyCloak authentication.
How do I update/upgrade CyVerse services?¶
Updates are managed through:
- Helm chart updates - Pull latest charts from repository
- Container image updates - Reference new image tags in deployments
- Database migrations - Run migration scripts before deploying new versions
- Rolling updates - Kubernetes handles zero-downtime deployments
Always test updates in a staging environment first. See deployment documentation.
Administration¶
How do I grant VICE access to a user?¶
VICE (Visual Interactive Computing Environment) access is granted through the Discovery Environment admin interface:
- Log into the Discovery Environment as an administrator
- Navigate to Admin → User Management
- Search for the user
- Grant "VICE Access" permission
- User must meet requirements: valid account, accepted terms of service, sufficient quota
See DE Administration Guide for detailed procedures.
How do I process a DOI/Permanent ID request?¶
DOI requests for data publishing follow this workflow:
- User submits request through Discovery Environment (Data → Permanent ID Request)
- Admin receives notification
- Review submission in admin interface for completeness
- Validate metadata and data quality
- Approve request - system creates DOI via DataCite
- Notify user of published DOI
See Permanent ID Requests documentation for complete SOP.
How do I add a new application to the Discovery Environment?¶
Publishing a containerized app to DE:
- Containerize your tool - Create Docker image with tool/software
- Push to registry - Upload to Docker Hub or Harbor
- Create integration - Use DE "Apps" interface to define:
- Input files
- Parameters
- Output files
- Resource requirements
- Test privately - Validate app functionality
- Publish publicly - Make available to all users
See DE Administration Guide for step-by-step instructions.
How do I manage user storage quotas?¶
Storage quotas are managed through:
- iRODS resource management - Set per-user/per-group quotas in iRODS
- User Portal - Display quota usage to users
- Admin tools - Monitor and adjust quotas as needed
Contact users approaching quota limits and provide options (clean up data, request increase, archive to cold storage).
How do I back up CyVerse databases?¶
Database backup strategy:
# PostgreSQL backup example
pg_dump -h db-host -U username -d de_database > de_backup.sql
# Automated daily backups
0 2 * * * /usr/local/bin/backup-cyverse-dbs.sh
Backup all service databases: - DE database - Metadata database - Notifications database - KeyCloak database - QMS database - Grouper database - Portal database - Unleash database
See Database documentation for detailed backup procedures.
How do I troubleshoot service failures?¶
Systematic troubleshooting approach:
- Check pod status:
kubectl get pods -n <namespace> - View logs:
kubectl logs -n <namespace> <pod-name> - Check events:
kubectl describe pod -n <namespace> <pod-name> - Verify dependencies: Ensure databases, RabbitMQ, Redis are accessible
- Check resource limits: CPU/memory constraints can cause crashes
- Review recent changes: Configuration updates, deployments, network changes
Common issues: - Database connection failures (check credentials, network) - Out of memory (increase resource limits) - Image pull errors (check registry access) - Configuration errors (validate YAML syntax)
How do I add a new administrator?¶
Administrator permissions are managed through KeyCloak:
- Log into KeyCloak admin console
- Navigate to Users
- Find or create the user
- Assign admin role:
de-adminfor Discovery Environmentdata-store-adminfor Data Storesuper-adminfor platform-wide access- User logs out and back in to activate permissions
See KeyCloak documentation for role management.
API & Development¶
How do I authenticate with the Terrain API?¶
Terrain uses OAUTH 2.0 authentication via KeyCloak:
Method 1: Get token via browser (authorization code flow) 1. Direct user to KeyCloak authorization URL 2. User authorizes your application 3. Exchange authorization code for access token
Method 2: Get token via command line (password flow)
TOKEN=$(curl -s -X POST \
https://kc.cyverse.org/auth/realms/CyVerse/protocol/openid-connect/token \
-d "grant_type=password" \
-d "client_id=de-client" \
-d "username=YOUR_USERNAME" \
-d "password=YOUR_PASSWORD" \
| jq -r '.access_token')
Use token in API calls:
curl -H "Authorization: Bearer $TOKEN" \
https://de.cyverse.org/terrain/filesystem/list
See Developer Guide for complete examples.
Where can I find the complete API documentation?¶
- Live Swagger UI - Interactive API testing
- Endpoint Index - Complete endpoint reference
- API Overview - Terrain API introduction
- Error Codes - Error handling guide
All Terrain endpoints are documented with request/response schemas and examples.
How do I migrate from Tapis v2 to Tapis v3?¶
Tapis v3 introduces breaking changes. Follow the Tapis v2 to v3 Migration Guide which covers:
- API endpoint changes
- Authentication updates
- Request/response format changes
- Deprecated features
- New capabilities
Plan for code changes in your integrations.
What are the API rate limits?¶
Current rate limiting (subject to change):
- 100 requests/minute per user for authenticated endpoints
- 10 requests/minute for unauthenticated endpoints
- Burst limit: 150 requests in 10 seconds
If you exceed limits, you'll receive HTTP 429 Too Many Requests. Implement exponential backoff in your client code.
For higher limits, contact CyVerse support with your use case.
How do I upload large files via the API?¶
For large files (>100MB), use the chunked upload endpoint:
import requests
def upload_large_file(access_token, local_file, dest_path, chunk_size=10*1024*1024):
"""Upload large file in chunks."""
# 1. Initialize upload session
init_response = requests.post(
"https://de.cyverse.org/terrain/secured/fileio/upload-init",
headers={"Authorization": f"Bearer {access_token}"},
json={"dest": dest_path, "filename": "largefile.dat"}
)
upload_id = init_response.json()["upload_id"]
# 2. Upload chunks
with open(local_file, "rb") as f:
chunk_num = 0
while True:
chunk = f.read(chunk_size)
if not chunk:
break
requests.post(
f"https://de.cyverse.org/terrain/secured/fileio/upload-chunk/{upload_id}",
headers={"Authorization": f"Bearer {access_token}"},
files={"chunk": chunk},
data={"chunk_number": chunk_num}
)
chunk_num += 1
# 3. Finalize upload
requests.post(
f"https://de.cyverse.org/terrain/secured/fileio/upload-complete/{upload_id}",
headers={"Authorization": f"Bearer {access_token}"}
)
See File I/O endpoints for details.
How do I receive notifications when my job completes?¶
Enable notifications when submitting a job:
payload = {
"app_id": "app-uuid",
"name": "My Job",
"notify": True, # Enable notifications
"config": {...}
}
You'll receive notifications via: - In-app notifications in Discovery Environment - Email (if configured in user preferences) - Webhook callbacks (if configured)
See Notifications API and Callbacks API.
Can I run the API locally for development?¶
Yes! Clone the terrain repository and other services:
# Clone terrain
git clone https://github.com/cyverse-de/terrain.git
cd terrain
# Install dependencies (Clojure/Leiningen)
lein deps
# Configure local settings
cp config/terrain.properties.example config/terrain.properties
# Edit config to point to local databases
# Run locally
lein run
See repository README files for service-specific setup instructions.
Troubleshooting¶
Why can't users log in to the Discovery Environment?¶
Common authentication issues:
- KeyCloak service down - Check
kubectl get pods -n keycloak - Database connection failure - KeyCloak can't reach its database
- LDAP sync issues - If using LDAP, verify connection to LDAP server
- Browser cache/cookies - Have user clear browser cache and try incognito
- CILogon integration broken - Check CILogon service status
- Certificate issues - Verify TLS certificates are valid
Debug steps:
# Check KeyCloak logs
kubectl logs -n keycloak deployment/keycloak
# Test database connection
kubectl exec -it -n keycloak deployment/keycloak -- psql -h db-host -U keycloak
# Verify LDAP connectivity
kubectl exec -it -n keycloak deployment/keycloak -- ldapsearch -x -H ldap://ldap-server
Why is the Data Store slow?¶
Performance bottlenecks:
- iRODS server overload - Check CPU/memory on iRODS servers
- Network congestion - Verify network bandwidth between nodes
- Slow storage backend - Check underlying storage performance (NFS, SAN)
- Database issues - iRODS catalog queries slow
- Too many concurrent users - Consider scaling iRODS servers
Monitoring:
# Check iRODS server status
ils -L # List with verbose output, observe latency
# Monitor iRODS resource usage
kubectl top pods -n irods
# Check network performance
iperf3 -c irods-server
See Data Store documentation for optimization tips.
Why are jobs stuck in "Submitted" status?¶
Job execution issues:
- HTCondor/K8s not processing jobs - Check job execution platform
- Insufficient resources - No available compute nodes
- Input staging failures - Data transfer from Data Store failed
- App configuration errors - Invalid app definition
- Permission issues - User lacks access to required resources
Debug steps:
# Check HTCondor queue (for batch jobs)
condor_q -analyze <job-id>
# Check Kubernetes pods (for VICE jobs)
kubectl get pods -n vice-apps
# Check analyses service logs
kubectl logs -n de deployment/analyses
Jobs are failing with "Out of Memory" errors¶
Memory issues:
- Insufficient container limits - Increase memory limit in app definition
- User requested too little memory - Guide users to request appropriate resources
- Node exhaustion - Add more Kubernetes nodes or increase node sizes
- Memory leaks in tool - Issue with the containerized tool itself
Solutions: - Increase memory limits: Edit app in DE Apps interface - Add resource quotas: Define default/max memory for different app categories - Scale cluster: Add more compute nodes
See Kubernetes resources documentation.
ElasticSearch is using too much disk space¶
ElasticSearch index management:
# Check index sizes
curl -X GET "localhost:9200/_cat/indices?v&s=store.size:desc"
# Delete old indices
curl -X DELETE "localhost:9200/old-index-name"
# Set up index lifecycle management (ILM)
curl -X PUT "localhost:9200/_ilm/policy/cleanup-policy" -H 'Content-Type: application/json' -d'
{
"policy": {
"phases": {
"delete": {
"min_age": "30d",
"actions": {"delete": {}}
}
}
}
}'
See ElasticSearch deployment documentation.
Data Management¶
How do I share data with collaborators?¶
Data sharing options:
-
Share within CyVerse - Give users/groups read or write permissions
# Via Data Store API POST /terrain/secured/filesystem/share { "paths": ["/iplant/home/user/folder"], "permissions": ["read"], "users": ["collaborator1", "collaborator2"] } -
Public links - Create anonymous access tickets
-
In DE: Right-click file/folder → Share → Create Public Link
-
DOI publication - Publish to Data Commons with permanent identifier
See Sharing API and Data Store Guide.
How do I search for data across the entire Data Store?¶
Use the search API:
curl -H "Authorization: Bearer $TOKEN" \
-X POST https://de.cyverse.org/terrain/secured/filesystem/search \
-H "Content-Type: application/json" \
-d '{"query": "genomics", "type": "file"}'
Search supports: - Full-text search in filenames and metadata - Filters by file type, date, size - User-defined metadata (AVU) searches
How do I attach metadata to data files?¶
Add Attribute-Value-Unit (AVU) metadata:
# Add metadata via API
requests.post(
"https://de.cyverse.org/terrain/secured/filesystem/metadata",
headers={"Authorization": f"Bearer {token}"},
json={
"path": "/iplant/home/user/data.csv",
"metadata": [
{"attr": "experiment_date", "value": "2024-01-15", "unit": ""},
{"attr": "sample_size", "value": "1000", "unit": "samples"}
]
}
)
Or use the DE interface: Right-click file → Metadata → Add
See Metadata API.
How do I recover deleted data?¶
Deleted files go to trash and can be restored within 30 days:
Via Discovery Environment: 1. Navigate to Trash folder 2. Select files/folders to restore 3. Click "Restore"
Via API:
curl -H "Authorization: Bearer $TOKEN" \
-X POST https://de.cyverse.org/terrain/secured/filesystem/restore \
-H "Content-Type: application/json" \
-d '{"paths": ["/iplant/trash/user/deleted-folder"]}'
After 30 days, files are permanently deleted. Implement backups for critical data.
See Restore API.
What's the maximum file size I can store?¶
File size limits:
- Single file: No hard limit (tested up to 100+ GB)
- Upload via web UI: 2 GB recommended (browser limitations)
- Upload via API/CLI: No limit
- iCommands (iRODS CLI): Recommended for files >2 GB
For very large files (>100 GB), use:
- iCommands: iput for direct iRODS uploads
- GoCommands: CyVerse's modern iRODS CLI
- Globus: For scheduled large transfers
Still Have Questions?¶
- Search this documentation - Use the search bar at the top
- Check service-specific docs - See Deployment Guide, API Reference, or Admin Guides
- GitHub Issues - Report issues or ask questions at github.com/cyverse-de
- CyVerse Support - Contact support at cyverse.org/help