Search Engines
Search engines like Lucenia, OpenSearch, and Elasticsearch execute custom scoring scripts that need sandboxing.
The Problem
Search engines allow users to provide custom scoring expressions or scripts:
- Scripts in Lucenia and OpenSearch/Elasticsearch
- Custom scorers in Lucene
- User-defined functions for ranking
Without sandboxing, these scripts can:
- Access arbitrary files
- Make network calls
- Execute malicious code
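To make the first risk concrete, here is a minimal sketch of a scoring function that reads a file unrelated to the index. The class name `UnsandboxedScoringDemo` and its `score` method are hypothetical and use only standard JDK calls. Plain Java does nothing to stop this read, which is the gap a sandbox policy is meant to close.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example: a "scoring script" that abuses its host's privileges.
final class UnsandboxedScoringDemo {

    // Looks like a scorer, but also reads a file that has nothing to do with the index.
    // Without a sandbox, the read succeeds with the search engine's own permissions.
    static float score(float baseScore, Path unrelatedFile) throws IOException {
        String leaked = Files.readString(unrelatedFile, StandardCharsets.UTF_8);
        // A malicious script could now encode `leaked` into scores or send it elsewhere.
        return baseScore + leaked.length();
    }
}
```

Calling `score(1.0f, somePath)` with any file the JVM process can read returns a score that depends on that file's contents, so the data has already escaped into the query result.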
The Solution
jGuard restricts script execution to safe operations:
security module custom.scoring {
    // Read index data only
    entitle module to fs.read(index, "**");

    // System properties for numeric operations
    entitle module to system.property.read("java.lang.*");

    // No network
    // No file writes
    // No threads
    // No native code
}
Architecture
┌────────────────────────────────────────────────────┐
│                   Search Engine                    │
│                                                    │
│  ┌────────────────┐                                │
│  │  Query Parser  │                                │
│  └───────┬────────┘                                │
│          │                                         │
│          ▼                                         │
│  ┌────────────────┐    ┌──────────────────┐        │
│  │ Script Engine  │───▶│  Custom Scorer   │        │
│  │                │    │  (sandboxed)     │        │
│  │ jGuard policy  │    │  • fs.read only  │        │
│  │ enforced       │    │  • no network    │        │
│  └────────────────┘    └──────────────────┘        │
│          │                                         │
│          ▼                                         │
│  ┌────────────────┐                                │
│  │    Results     │                                │
│  └────────────────┘                                │
└────────────────────────────────────────────────────┘
Example: Lucene Custom Scorer
Policy
security module com.example.search.scoring {
    // Read index segments
    entitle module to fs.read(index, "**/_*.cfs");
    entitle module to fs.read(index, "**/_*.cfe");
    entitle module to fs.read(index, "**/segments_*");

    // System properties for math operations
    entitle module to system.property.read;

    // Strictly no:
    // - network.outbound
    // - fs.write
    // - threads.create
    // - native.load
}
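To see which files the three `fs.read` patterns would cover, the sketch below approximates them with the JDK's glob matcher. This is an assumption: jGuard's own pattern semantics may differ from `java.nio` globs, and the class name `IndexPatternCheck` is made up for illustration. Paths are taken relative to the index root.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;

// Approximates the three fs.read patterns with JDK glob matching.
// Assumption: jGuard's own pattern semantics may differ from java.nio globs.
final class IndexPatternCheck {

    private static final List<PathMatcher> ALLOWED = List.of(
            matcher("**/_*.cfs"),
            matcher("**/_*.cfe"),
            matcher("**/segments_*"));

    private static PathMatcher matcher(String glob) {
        return FileSystems.getDefault().getPathMatcher("glob:" + glob);
    }

    // True if the path (relative to the index root) matches any allowed pattern.
    static boolean isReadable(Path relativePath) {
        return ALLOWED.stream().anyMatch(m -> m.matches(relativePath));
    }
}
```

Under JDK glob rules, `shard0/index/_0.cfs` and `shard0/index/segments_5` match, while `shard0/index/write.lock` does not. A check like this can be handy in a unit test to confirm that a policy's patterns cover the files a scorer actually needs.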
Java Code
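import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.search.Scorer;

// Sketch only: reader, docId, baseScore, and parseBoost are assumed to be
// defined elsewhere in the class, as are Scorer's other abstract methods.
public class SecureScorer extends Scorer {
    @Override
    public float score() throws IOException {
        // Safe: reading from the index (fs.read is entitled)
        Document doc = reader.document(docId);
        float boost = parseBoost(doc);

        // BLOCKED: the policy grants no network.outbound entitlement,
        // so this call would be denied
        // new URL("http://evil.com").openStream();

        return baseScore * boost;
    }
}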
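The scorer calls `parseBoost(doc)` without showing it. One possible shape for such a helper is sketched below, with assumptions: it takes the raw stored-field string instead of a Lucene `Document` so that it runs without Lucene on the classpath, and the `BoostParser` class, the default of 1.0, and the cap of 10.0 are all hypothetical choices, not values from the source.

```java
// Hypothetical helper: the source calls parseBoost(doc) but does not show it.
// This version takes the raw stored-field string so it runs without Lucene.
final class BoostParser {
    static final float DEFAULT_BOOST = 1.0f;
    static final float MAX_BOOST = 10.0f; // assumed cap, not from the source

    // Returns a positive boost no larger than MAX_BOOST, or DEFAULT_BOOST
    // when the field is missing, malformed, NaN, or non-positive.
    static float parseBoost(String raw) {
        if (raw == null || raw.isBlank()) {
            return DEFAULT_BOOST;
        }
        try {
            float value = Float.parseFloat(raw.trim());
            if (Float.isNaN(value) || value <= 0f) {
                return DEFAULT_BOOST;
            }
            return Math.min(value, MAX_BOOST);
        } catch (NumberFormatException e) {
            return DEFAULT_BOOST;
        }
    }
}
```

Falling back to a neutral boost instead of throwing keeps one malformed document from failing the whole query, and the cap limits how far a single stored value can skew ranking.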
Lucenia Integration
Lucenia uses jGuard for its scripting sandbox:
security module io.lucenia.script {
    // Script can read index and codec files
    entitle module to fs.read(data, "**");

    // Allow system properties for numeric types
    entitle module to system.property.read("java.**");

    // No dangerous operations
    deny module to network.outbound;
    deny module to fs.write;
    deny module to threads.create;
    deny module to native.load;
}
OpenSearch/Elasticsearch Patterns
For search systems with plugin architectures:
// Core search module
security module search.core {
    entitle module to fs.read(index, "**");
    entitle module to fs.write(index, "**");
    entitle module to network.listen(9200);
    entitle module to threads.create;
}

// Scripting plugin - restricted
security module search.scripting {
    entitle module to fs.read(index, "**");
    entitle module to system.property.read;
    // No writes, no network, no threads
}

// Network plugin - network only
security module search.network {
    entitle module to network.outbound("*.internal", 9300);
    entitle module to network.listen(9300);
}
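The split between the three modules can be modeled as a lookup table from module name to granted capabilities. The sketch below is a conceptual model only and is not jGuard's API. The `PluginEntitlements` class and `Capability` enum are hypothetical, and it assumes that anything not entitled is denied, as the comments in the policies imply.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Conceptual model of the three plugin policies; not jGuard's API.
final class PluginEntitlements {

    enum Capability {
        FS_READ, FS_WRITE, NETWORK_LISTEN, NETWORK_OUTBOUND, THREADS_CREATE, SYSTEM_PROPERTY_READ
    }

    // Each module gets only what its policy grants (assumed default-deny for the rest).
    static final Map<String, Set<Capability>> POLICIES = Map.of(
            "search.core", EnumSet.of(Capability.FS_READ, Capability.FS_WRITE,
                    Capability.NETWORK_LISTEN, Capability.THREADS_CREATE),
            "search.scripting", EnumSet.of(Capability.FS_READ, Capability.SYSTEM_PROPERTY_READ),
            "search.network", EnumSet.of(Capability.NETWORK_OUTBOUND, Capability.NETWORK_LISTEN));

    // Modules without a policy have no entitlements at all.
    static boolean isEntitled(String module, Capability capability) {
        return POLICIES.getOrDefault(module, Set.of()).contains(capability);
    }
}
```

In this model, `isEntitled("search.scripting", Capability.FS_WRITE)` is false while `isEntitled("search.core", Capability.FS_WRITE)` is true, so a compromised script cannot touch the index even though the core module can.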
Query-Time Security
Different query types can have different restrictions:
// Simple queries - minimal access
security module query.simple {
    entitle module to fs.read(index, "**");
}

// Aggregation queries - thread pool allowed
security module query.aggregation {
    entitle module to fs.read(index, "**");
    entitle module to threads.create;
}

// Admin queries - full access
security module query.admin {
    entitle module to fs.read(index, "**");
    entitle module to fs.write(index, "**");
    entitle module to threads.create;
}
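The three query tiers form an escalation: each one grants everything the previous one does, plus more. A small check like the sketch below could guard that property as the policies evolve. The `QueryTiers` class is hypothetical, and the entitlement names are copied from the policies as plain strings, not read from jGuard.

```java
import java.util.List;
import java.util.Set;

// Models the three query policies as sets of entitlement names.
final class QueryTiers {
    static final Set<String> SIMPLE = Set.of("fs.read");
    static final Set<String> AGGREGATION = Set.of("fs.read", "threads.create");
    static final Set<String> ADMIN = Set.of("fs.read", "fs.write", "threads.create");

    // True when each tier grants everything the tier before it grants.
    static boolean isEscalating(List<Set<String>> tiers) {
        for (int i = 1; i < tiers.size(); i++) {
            if (!tiers.get(i).containsAll(tiers.get(i - 1))) {
                return false;
            }
        }
        return true;
    }
}
```

`isEscalating(List.of(SIMPLE, AGGREGATION, ADMIN))` holds for the policies as written. If a later edit gave `query.simple` an entitlement that `query.aggregation` lacks, the check would fail and flag the inconsistency.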
Best Practices
- Separate scoring from indexing - The two have different security requirements
- Deny network access for scripts - Scripts shouldn't phone home
- Deny file writes - Scripts read data; they don't modify it
- Audit unknown scripts - Run them in audit mode first
- Use external policies - Override script permissions at deployment
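The difference between auditing a script and enforcing a policy on it can be sketched as follows. This is a conceptual model under stated assumptions: the source does not show how jGuard's audit mode is configured or what it reports, so the `AuditModeSketch` class, its `Mode` enum, and the use of `SecurityException` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Conceptual model of audit vs. enforce; jGuard's real audit mechanism is not shown in the source.
final class AuditModeSketch {
    enum Mode { AUDIT, ENFORCE }

    private final Mode mode;
    private final Set<String> entitled;
    private final List<String> violations = new ArrayList<>();

    AuditModeSketch(Mode mode, Set<String> entitled) {
        this.mode = mode;
        this.entitled = Set.copyOf(entitled);
    }

    // In AUDIT mode a violation is recorded and the call proceeds;
    // in ENFORCE mode it is recorded and then rejected.
    void check(String capability) {
        if (entitled.contains(capability)) {
            return;
        }
        violations.add(capability);
        if (mode == Mode.ENFORCE) {
            throw new SecurityException("Denied: " + capability);
        }
    }

    // Snapshot of everything the script attempted beyond its entitlements.
    List<String> violations() {
        return List.copyOf(violations);
    }
}
```

Running an unfamiliar script in the audit configuration first yields a list of what it tried to do outside its entitlements. That list can then inform either a tighter policy or a decision to reject the script before switching to enforcement.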