Implement ReadsHtsgetData source, refactor HtsgetReader #6662

Draft
wants to merge 26 commits into
base: master
a476ce7
Implement ReadsHtsgetData source, refactor HtsgetReader
andersleung Jun 16, 2020
ff50eec
Update HtsgetReader command line tests
andersleung Jun 16, 2020
aa4d48a
Commit missing files
andersleung Jun 16, 2020
ba2dd80
Add javadoc, further refactoring to allow easier testing, lazily stre…
andersleung Jun 17, 2020
0cb9530
Use MergingSamRecordIterator internally to ensure proper ordering of …
andersleung Jun 24, 2020
dce337f
Close out all iterators, request headers asynchronously, minor refact…
andersleung Jun 30, 2020
edda05c
Perform map insertion outside of future to avoid concurrent modification
andersleung Jun 30, 2020
3415528
Address PR comments
andersleung Jul 6, 2020
74f7d70
Fix test
andersleung Jul 6, 2020
890da67
WIP Start adding ReadsHtsgetDataSource tests
andersleung Jul 10, 2020
c0cc6ee
Merge branch 'master' into readsHtsgetDataSource
andersleung Jul 10, 2020
8afa04b
WIP fix broken tests
andersleung Jul 13, 2020
946bb4b
WIP Add tests for filtering duplicates, try 127.0.0.1 instead of loca…
andersleung Jul 13, 2020
100b0a4
WIP Try spawning sibling docker container for refserver
andersleung Jul 13, 2020
c6a36cf
WIP try 0.0.0.0:3000 instead of 127.0.0.1
andersleung Jul 13, 2020
1022cbe
WIP Specify 127.0.0.1 in port mapping when running docker
andersleung Jul 13, 2020
062d9ca
WIP configure refserver to listen on 0.0.0.0
andersleung Jul 13, 2020
f166712
WIP configure all IP addresses to 0.0.0.0:3000
andersleung Jul 14, 2020
37eb57d
WIP run test container with net=host
andersleung Jul 14, 2020
1e77370
Add comment to .travis.yml
andersleung Jul 14, 2020
f71ea84
Add end to end tests in PrintReads using htsget source
andersleung Jul 24, 2020
7967843
Merge branch 'master' into readsHtsgetDataSource
andersleung Jul 24, 2020
c437c29
WIP use htsjdk HtsgetBAMFileReader, refactor ReadsPathDataSource to u…
andersleung Aug 18, 2020
27e7dc7
Merge branch 'master' into readsHtsgetDataSource
andersleung Aug 18, 2020
a32a9e3
Add readme to htsgetScripts, move htsget_config.json
andersleung Aug 19, 2020
ea540d1
Remove ReadsHtsgetDataSource and GATK versions of htsget classes
andersleung Aug 19, 2020
WIP Start adding ReadsHtsgetDataSource tests
andersleung committed Jul 10, 2020
commit 890da677da65d5d6b008f55c4565cf8ba10afc21
4 changes: 4 additions & 0 deletions .travis.yml
@@ -4,6 +4,8 @@ group: travis_lts
git:
depth: 9999999
lfs_skip_smudge: true
services:
- docker
jdk:
- openjdk8
env:
@@ -101,6 +103,8 @@ before_install:
# http://docs.travis-ci.com/user/database-setup/#MySQL
- sudo /etc/init.d/mysql stop
- sudo /etc/init.d/postgresql stop
- sudo bash scripts/htsgetScripts/launchDocker.sh

install:
- if [[ $TRAVIS_SECURE_ENV_VARS == false && $TEST_TYPE == cloud ]]; then
echo "Can't run cloud tests without keys so don't bother building";
Expand Down
11 changes: 11 additions & 0 deletions scripts/htsgetScripts/launchDocker.sh
@@ -0,0 +1,11 @@
WORKING_DIR=/home/travis/build/broadinstitute
Member (review comment): Please add a comment to this script explaining what it does.

Contributor Author (reply): Renamed file and added a README to the script folder.

echo "pwd is `pwd`"
ls

docker pull ga4gh/htsget-ref:1.1.0
docker container run -d --name htsget-server -p 3000:3000 --env HTSGET_PORT=3000 --env HTSGET_HOST=http://localhost:3000 -v $WORKING_DIR/gatk/src/test/resources/org/broadinstitute/hellbender/engine:/data ga4gh/htsget-ref:1.1.0 ./htsref -config /data/htsget_config.json

echo "Running docker containers"

docker container ls -a
Original file line number Diff line number Diff line change
@@ -29,9 +29,9 @@
* Manages traversals and queries over sources of reads which are accessible via {@link GATKPath}s pointing to a file
* behind an htsget server
* (for now, BAM/CRAM files only).
*
* <p>
* Two basic operations are available:
*
* <p>
* -Iteration over all reads, optionally restricted to reads that overlap a set of intervals
* -Targeted queries by one interval at a time
*/
@@ -48,18 +48,18 @@ public final class ReadsHtsgetDataSource implements ReadsDataSource {
/**
* Only reads that overlap these intervals (and unmapped reads, if {@link #traverseUnmapped} is set) will be returned
* during a full iteration. Null if iteration is unbounded.
*
* <p>
* Individual queries are unaffected by these intervals -- only traversals initiated via {@link #iterator} are affected.
*/
private List<SimpleInterval> intervals;

/**
* If true, restrict traversals to unmapped reads (and reads overlapping any {@link #intervals}, if set).
* False if iteration is unbounded or bounded only by our {@link #intervals}.
*
* <p>
* Note that this setting covers only unmapped reads that have no position -- unmapped reads that are assigned the
* position of their mates will be returned by queries overlapping that position.
*
* <p>
* Individual queries are unaffected by this setting -- only traversals initiated via {@link #iterator} are affected.
*/
private boolean traverseUnmapped;
@@ -110,7 +110,7 @@ public ReadsHtsgetDataSource(final List<GATKPath> sources) {
/**
* Initialize this data source with a single SAM/BAM file and a custom SamReaderFactory
*
* @param source path to SAM/BAM file, not null.
* @param source path to SAM/BAM file, not null.
* @param customSamReaderFactory SamReaderFactory to use, if null a default factory with no reference and validation
* stringency SILENT is used.
*/
@@ -121,7 +121,7 @@ public ReadsHtsgetDataSource(final GATKPath source, final SamReaderFactory custo
/**
* Initialize this data source with multiple SAM/BAM files and a custom SamReaderFactory
*
* @param sources path to SAM/BAM file, not null.
* @param sources path to SAM/BAM file, not null.
* @param customSamReaderFactory SamReaderFactory to use, if null a default factory with no reference and validation
* stringency SILENT is used.
*/
@@ -197,23 +197,23 @@ public ReadsHtsgetDataSource(final List<GATKPath> sources, final SamReaderFactor
/**
* Restricts a traversal of this data source via {@link #iterator} to only return reads that overlap the given intervals,
* and to unmapped reads if specified.
*
* <p>
* Calls to {@link #query} are not affected by this method.
*
* @param intervals Our next full traversal will return reads overlapping these intervals
* @param intervals Our next full traversal will return reads overlapping these intervals
* @param traverseUnmapped Our next full traversal will return unmapped reads (this affects only unmapped reads that
* have no position -- unmapped reads that have the position of their mapped mates will be
* included if the interval overlapping that position is included).
*/
@Override
public void setTraversalBounds(final List<SimpleInterval> intervals, final boolean traverseUnmapped) {
this.intervals = intervals != null && ! intervals.isEmpty() ? intervals : null;
this.intervals = intervals != null && !intervals.isEmpty() ? intervals : null;
this.traverseUnmapped = traverseUnmapped;
}

/**
* @return True if traversals initiated via {@link #iterator} will be restricted to reads that overlap intervals
* as configured via {@link #setTraversalBounds}, otherwise false
* as configured via {@link #setTraversalBounds}, otherwise false
*/
@Override
public boolean traversalIsBounded() {
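The Javadoc in this hunk describes how traversal bounds are stored and reported. A minimal sketch of that bookkeeping follows. The class name `TraversalBoundsSketch` and the use of `String` in place of `SimpleInterval` are hypothetical, and the body of `traversalIsBounded` is not visible in this diff, so the rule used here (bounded if intervals are set or unmapped traversal is requested) is an assumption inferred from the field Javadoc.

```java
import java.util.List;

// Hypothetical stand-in for the bounds handling in ReadsHtsgetDataSource.
public class TraversalBoundsSketch {
    // Null means "no interval restriction", mirroring the field Javadoc.
    private List<String> intervals;
    private boolean traverseUnmapped;

    public void setTraversalBounds(final List<String> intervals, final boolean traverseUnmapped) {
        // An empty list is normalized to null, as in the diff above.
        this.intervals = intervals != null && !intervals.isEmpty() ? intervals : null;
        this.traverseUnmapped = traverseUnmapped;
    }

    // Assumption: a traversal counts as bounded if either restriction is active.
    public boolean traversalIsBounded() {
        return intervals != null || traverseUnmapped;
    }
}
```

Normalizing the empty list to null keeps "no intervals" and "empty intervals" from diverging later in the iterator code.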
@@ -222,6 +222,7 @@ public boolean traversalIsBounded() {

/**
* This data source can be queried even without index files
*
* @return always true
*/
@Override
@@ -234,7 +235,7 @@ public boolean isQueryableByInterval() {
* iteration is limited to reads that overlap that set of intervals.
*
* @return An iterator over the reads in this data source, limited to reads that overlap the intervals supplied
* via {@link #setTraversalBounds} (if intervals were provided)
* via {@link #setTraversalBounds} (if intervals were provided)
*/
@Override
@Nonnull
@@ -256,7 +257,7 @@ public Iterator<GATKRead> query(final SimpleInterval interval) {

/**
* @return An iterator over just the unmapped reads with no assigned position. This operation is not affected
* by prior calls to {@link #setTraversalBounds}.
* by prior calls to {@link #setTraversalBounds}.
*/
@Override
public Iterator<GATKRead> queryUnmapped() {
@@ -388,7 +389,8 @@ public boolean filterOut(final SAMRecord first, final SAMRecord second) {

/**
* Wrap an iterator to allow us to free the backing SamReader without holding onto an explicit reference to it
* @param iterator the iterator to wrap
*
* @param iterator the iterator to wrap
* @param samReader the SamReader to close once the iterator has been used up
* @return a wrapped CloseableIterator
*/
Expand Down
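The last Javadoc block above describes wrapping an iterator so that the backing reader is closed once the iterator is used up. A minimal sketch of that pattern follows, using only the standard library. `ClosingIteratorSketch` is a hypothetical name, and `java.io.Closeable` stands in for `SamReader`; this is not the wrapper implemented in the PR, whose body is not shown in this diff.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical wrapper: closes the backing resource when the iterator is exhausted.
public class ClosingIteratorSketch<T> implements Iterator<T>, AutoCloseable {
    private final Iterator<T> delegate;
    private final Closeable resource;
    private boolean closed = false;

    public ClosingIteratorSketch(final Iterator<T> delegate, final Closeable resource) {
        this.delegate = delegate;
        this.resource = resource;
    }

    @Override
    public boolean hasNext() {
        if (closed) {
            return false;
        }
        if (delegate.hasNext()) {
            return true;
        }
        close(); // nothing left to read, so release the resource now
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return delegate.next();
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            try {
                resource.close();
            } catch (final IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

Callers that stop iterating early still need to call `close()` themselves, which is why the sketch also implements `AutoCloseable`.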
Original file line number Diff line number Diff line change
@@ -39,8 +39,8 @@ public class HtsgetRequest {

private static final ObjectMapper mapper = getObjectMapper();

- final private URI endpoint;
- final private String id;
+ private final URI endpoint;
+ private final String id;

// Query parameters
private HtsgetFormat format;
@@ -59,16 +59,11 @@ public HtsgetRequest(final URI endpoint, final String id) {
}

public HtsgetRequest(final GATKPath source) {
- try {
- final URI sourceURI = source.getURI();
- this.endpoint = new URI("//" + sourceURI.getHost());
- this.id = sourceURI.getPath();
- this.fields = EnumSet.noneOf(HtsgetRequestField.class);
- this.tags = new HashSet<>();
- this.notags = new HashSet<>();
- } catch (final URISyntaxException e) {
- throw new UserException(source.toString(), e);
- }
+ this.endpoint = source.getURI();
+ this.id = "";
andersleung marked this conversation as resolved.
+ this.fields = EnumSet.noneOf(HtsgetRequestField.class);
+ this.tags = new HashSet<>();
+ this.notags = new HashSet<>();
}

public URI getEndpoint() {
@@ -221,6 +216,7 @@ public URI toURI() {
this.validateRequest();
final UriBuilder builder = UriBuilder.fromUri(this.endpoint)
.scheme("http")
+ .port(this.endpoint.getPort())
.path(this.id);

if (this.format != null) {
Expand Down
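The `toURI` change above carries the endpoint's port into the request URI, which matters when the test server listens on a non-default port such as 3000. A minimal sketch of the same idea follows, using only `java.net.URI` rather than the `UriBuilder` used in the PR. The class `HtsgetUriSketch`, the method `toHttpUri`, and the `htsget://` scheme in the usage example are all hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: rebuilds an endpoint as an http URI while preserving its port.
public class HtsgetUriSketch {
    public static URI toHttpUri(final URI endpoint, final String id, final String query) {
        String path = endpoint.getPath() == null ? "" : endpoint.getPath();
        // Strip a leading slash so the join below never produces "//".
        final String cleanId = id.startsWith("/") ? id.substring(1) : id;
        if (!cleanId.isEmpty()) {
            path = path.endsWith("/") ? path + cleanId : path + "/" + cleanId;
        }
        try {
            // getPort() returns -1 when no port is given; URI then omits the port.
            return new URI("http", null, endpoint.getHost(), endpoint.getPort(), path, query, null);
        } catch (final URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build request URI from " + endpoint, e);
        }
    }

    public static void main(final String[] args) {
        final URI endpoint = URI.create("htsget://127.0.0.1:3000/reads/sample.bam");
        System.out.println(toHttpUri(endpoint, "", "format=BAM"));
    }
}
```

With an empty `id`, as in the new `GATKPath` constructor, the whole resource path comes from the endpoint itself.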