Skip to content

Commit 7d42260

Browse files
committed
add genomic element lookup methods to Chromosome
1 parent f3c49bf commit 7d42260

7 files changed

Lines changed: 184 additions & 61 deletions

File tree

QtSLiM/help/SLiMHelpClasses.html

Lines changed: 47 additions & 43 deletions
Large diffs are not rendered by default.

SLiMgui/SLiMHelpClasses.rtf

Lines changed: 44 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -85,8 +85,8 @@ geneConversionEnabled => (logical$)\
8585
8686
\f4\fs20 \cf0 All of the
8787
\f3\fs18 GenomicElement
88-
\f4\fs20 objects that comprise the chromosome.
89-
\f5 \
88+
\f4\fs20 objects that comprise the chromosome\cf2 , in sorted order (not necessarily in the order in which they were defined).
89+
\f5 \cf0 \
9090
\pard\pardeftab720\li720\fi-446\ri720\sb180\sa60\partightenfactor0
9191
9292
\f3\fs18 \cf2 \expnd0\expndtw0\kerning0
@@ -532,7 +532,48 @@ It is generally recommended that the parent individual be supplied to this metho
532532
\
533533
\pard\pardeftab720\li720\fi-446\ri720\sb180\sa60\partightenfactor0
534534
535-
\f3\fs18 \cf2 \'96\'a0(integer$)setAncestralNucleotides(is\'a0sequence)\
535+
\f3\fs18 \cf2 \kerning1\expnd0\expndtw0 \'96\'a0(object<GenomicElement>)genomicElementForPosition(integer\'a0positions)\
536+
\pard\pardeftab720\li547\ri720\sb60\sa60\partightenfactor0
537+
538+
\f4\fs20 \cf2 Returns a vector of
539+
\f3\fs18 GenomicElement
540+
\f4\fs20 objects corresponding to the given vector
541+
\f3\fs18 positions
542+
\f4\fs20 , which contains base positions along the chromosome. If every position lies within a defined genomic element, the returned vector will have the same length as
543+
\f3\fs18 positions
544+
\f4\fs20 , and will correspond one-to-one with it. However, if a position in
545+
\f3\fs18 positions
546+
\f4\fs20 is not within a genomic element, no
547+
\f3\fs18 GenomicElement
548+
\f4\fs20 object will be present for it in the returned vector, and so the returned vector will no longer have the same length as
549+
\f3\fs18 positions
550+
\f4\fs20 , and will no longer correspond one-to-one with it. The method
551+
\f3\fs18 hasGenomicElementForPosition()
552+
\f4\fs20 can be used to detect this circumstance.\
553+
\pard\pardeftab720\li720\fi-446\ri720\sb180\sa60\partightenfactor0
554+
555+
\f3\fs18 \cf2 \'96\'a0(logical)hasGenomicElementForPosition(integer\'a0positions)\
556+
\pard\pardeftab720\li547\ri720\sb60\sa60\partightenfactor0
557+
558+
\f4\fs20 \cf2 Returns a
559+
\f3\fs18 logical
560+
\f4\fs20 vector corresponding to the given vector
561+
\f3\fs18 positions
562+
\f4\fs20 , which contains base positions along the chromosome. The returned vector will have the same length as
563+
\f3\fs18 positions
564+
\f4\fs20 , and will correspond one-to-one with it, containing
565+
\f3\fs18 T
566+
\f4\fs20 if the corresponding position lies inside a genomic element, or
567+
\f3\fs18 F
568+
\f4\fs20 if it does not. The method
569+
\f3\fs18 genomicElementForPosition()
570+
\f4\fs20 can be used to look up the
571+
\f3\fs18 GenomicElement
572+
\f4\fs20 objects themselves.\
573+
\pard\pardeftab720\li720\fi-446\ri720\sb180\sa60\partightenfactor0
574+
575+
\f3\fs18 \cf2 \expnd0\expndtw0\kerning0
576+
\'96\'a0(integer$)setAncestralNucleotides(is\'a0sequence)\
536577
\pard\pardeftab720\li547\ri720\sb60\sa60\partightenfactor0
537578
538579
\f4\fs20 \cf2 This method, which may be called only in nucleotide-based models, replaces the ancestral nucleotide sequence for the model. The

VERSIONS

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,8 @@ development head (in the master branch):
3131
for more details see https://doc.qt.io/qt-6/linux.html, https://doc.qt.io/qt-6/windows.html, https://doc.qt.io/qt-6/macos.html
3232
building with CMake will now default to Qt6 if CMake can find it, and fall back to Qt5 otherwise; should be automatic
3333
add a "Use OpenGL for fast display" checkbox in the Preferences panel, making it possible to disable OpenGL when it doesn't work well (#462)
34+
add genomicElementForPosition() and hasGenomicElementForPosition() methods on Chromosome
35+
policy change: genomicElements property is now in sorted order (not in order of declaration)
3436

3537

3638
version 4.2.2 (Eidos version 3.2.2):

core/chromosome.cpp

Lines changed: 79 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -170,6 +170,12 @@ void Chromosome::InitializeDraws(void)
170170
if (genomic_elements_.size() == 0)
171171
EIDOS_TERMINATION << "ERROR (Chromosome::InitializeDraws): empty chromosome." << EidosTerminate();
172172

173+
// BCH 8/28/2024: we now sort the genomic elements here; we need them to be sorted later on anyway
174+
// I avoided doing this before because it changed the behavior, but it is hard to imagine anybody caring
175+
// Since elements must be non-overlapping, we only need to sort by start position
176+
std::sort(genomic_elements_.begin(), genomic_elements_.end(),
177+
[](GenomicElement *e1, GenomicElement *e2) { return e1->start_position_ < e2->start_position_; });
178+
173179
// determine which case we are working with: separate recombination maps for the sexes, or one map
174180
auto rec_rates_H_size = recombination_rates_H_.size();
175181
auto rec_rates_M_size = recombination_rates_M_.size();
@@ -614,12 +620,7 @@ void Chromosome::_InitializeOneMutationMap(gsl_ran_discrete_t *&p_lookup, std::v
614620
// The class we use to represent these constant-rate subregions is GESubrange, declared in chromosome.h.
615621
p_subranges.clear();
616622

617-
// We need to work with a *sorted* genomic elements vector here. We sort it internally here, rather than in
618-
// genomic_elements_, so that genomic_elements_ reflects exactly what they user gave us (mostly for backward
619-
// compatibility).
620-
std::vector<GenomicElement *> sorted_ge_vec = genomic_elements_;
621-
622-
std::sort(sorted_ge_vec.begin(), sorted_ge_vec.end(), [](GenomicElement *ge1, GenomicElement *ge2) {return ge1->start_position_ < ge2->start_position_;});
623+
// We need to work with a *sorted* genomic elements vector here. BCH 8/28/2024: That is now guaranteed.
623624

624625
// This code deals in two different currencies: *requested* mutation rates and *adjusted* mutation rates.
625626
// This stems from the fact that, beginning in SLiM 3.5, we unique mutation positions as we're generating
@@ -638,7 +639,7 @@ void Chromosome::_InitializeOneMutationMap(gsl_ran_discrete_t *&p_lookup, std::v
638639
unsigned int mutrange_index = 0;
639640
slim_position_t end_of_previous_mutrange = -1;
640641

641-
for (GenomicElement *ge_ptr : sorted_ge_vec)
642+
for (GenomicElement *ge_ptr : genomic_elements_)
642643
{
643644
GenomicElement &ge = *ge_ptr;
644645

@@ -1858,14 +1859,16 @@ EidosValue_SP Chromosome::ExecuteInstanceMethod(EidosGlobalStringID p_method_id,
18581859
{
18591860
switch (p_method_id)
18601861
{
1861-
case gID_ancestralNucleotides: return ExecuteMethod_ancestralNucleotides(p_method_id, p_arguments, p_interpreter);
1862-
case gID_setAncestralNucleotides: return ExecuteMethod_setAncestralNucleotides(p_method_id, p_arguments, p_interpreter);
1863-
case gID_setGeneConversion: return ExecuteMethod_setGeneConversion(p_method_id, p_arguments, p_interpreter);
1864-
case gID_setHotspotMap: return ExecuteMethod_setHotspotMap(p_method_id, p_arguments, p_interpreter);
1865-
case gID_setMutationRate: return ExecuteMethod_setMutationRate(p_method_id, p_arguments, p_interpreter);
1866-
case gID_setRecombinationRate: return ExecuteMethod_setRecombinationRate(p_method_id, p_arguments, p_interpreter);
1867-
case gID_drawBreakpoints: return ExecuteMethod_drawBreakpoints(p_method_id, p_arguments, p_interpreter);
1868-
default: return super::ExecuteInstanceMethod(p_method_id, p_arguments, p_interpreter);
1862+
case gID_ancestralNucleotides: return ExecuteMethod_ancestralNucleotides(p_method_id, p_arguments, p_interpreter);
1863+
case gID_genomicElementForPosition: return ExecuteMethod_genomicElementForPosition(p_method_id, p_arguments, p_interpreter);
1864+
case gID_hasGenomicElementForPosition: return ExecuteMethod_hasGenomicElementForPosition(p_method_id, p_arguments, p_interpreter);
1865+
case gID_setAncestralNucleotides: return ExecuteMethod_setAncestralNucleotides(p_method_id, p_arguments, p_interpreter);
1866+
case gID_setGeneConversion: return ExecuteMethod_setGeneConversion(p_method_id, p_arguments, p_interpreter);
1867+
case gID_setHotspotMap: return ExecuteMethod_setHotspotMap(p_method_id, p_arguments, p_interpreter);
1868+
case gID_setMutationRate: return ExecuteMethod_setMutationRate(p_method_id, p_arguments, p_interpreter);
1869+
case gID_setRecombinationRate: return ExecuteMethod_setRecombinationRate(p_method_id, p_arguments, p_interpreter);
1870+
case gID_drawBreakpoints: return ExecuteMethod_drawBreakpoints(p_method_id, p_arguments, p_interpreter);
1871+
default: return super::ExecuteInstanceMethod(p_method_id, p_arguments, p_interpreter);
18691872
}
18701873
}
18711874

@@ -2017,6 +2020,65 @@ EidosValue_SP Chromosome::ExecuteMethod_drawBreakpoints(EidosGlobalStringID p_me
20172020
return EidosValue_SP(new (gEidosValuePool->AllocateChunk()) EidosValue_Int(all_breakpoints));
20182021
}
20192022

2023+
GenomicElement *Chromosome::ElementForPosition(slim_position_t pos)
2024+
{
2025+
auto element_iter = std::lower_bound(genomic_elements_.begin(), genomic_elements_.end(), pos,
2026+
[](const GenomicElement *e, slim_position_t position) { return e->end_position_ < position; });
2027+
2028+
if (element_iter == genomic_elements_.end())
2029+
return nullptr;
2030+
2031+
GenomicElement *element = *element_iter;
2032+
2033+
if (element->start_position_ > pos)
2034+
return nullptr;
2035+
2036+
return element;
2037+
}
2038+
2039+
// ********************* (object<GenomicElement>)genomicElementForPosition(integer positions)
2040+
//
2041+
EidosValue_SP Chromosome::ExecuteMethod_genomicElementForPosition(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter)
2042+
{
2043+
#pragma unused (p_method_id, p_arguments, p_interpreter)
2044+
EidosValue *positions_value = p_arguments[0].get();
2045+
int positions_count = positions_value->Count();
2046+
EidosValue_Object_SP obj_result_SP = EidosValue_Object_SP(new (gEidosValuePool->AllocateChunk()) EidosValue_Object(gSLiM_GenomicElement_Class));
2047+
EidosValue_Object *obj_result = obj_result_SP->reserve(positions_count);
2048+
const int64_t *positions_data = positions_value->IntData();
2049+
2050+
for (int pos_index = 0; pos_index < positions_count; ++pos_index)
2051+
{
2052+
int64_t pos = positions_data[pos_index];
2053+
GenomicElement *element = ElementForPosition(pos);
2054+
2055+
if (element)
2056+
obj_result->push_object_element_no_check_NORR(element);
2057+
}
2058+
2059+
return obj_result_SP;
2060+
}
2061+
2062+
// ********************* (logical)hasGenomicElementForPosition(integer positions)
2063+
//
2064+
EidosValue_SP Chromosome::ExecuteMethod_hasGenomicElementForPosition(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter)
2065+
{
2066+
#pragma unused (p_method_id, p_arguments, p_interpreter)
2067+
EidosValue *positions_value = p_arguments[0].get();
2068+
int positions_count = positions_value->Count();
2069+
EidosValue_Logical_SP logical_result_SP = EidosValue_Logical_SP(new (gEidosValuePool->AllocateChunk()) EidosValue_Logical());
2070+
EidosValue_Logical *logical_result = logical_result_SP->reserve(positions_count);
2071+
const int64_t *positions_data = positions_value->IntData();
2072+
2073+
for (int pos_index = 0; pos_index < positions_count; ++pos_index)
2074+
{
2075+
int64_t pos = positions_data[pos_index];
2076+
logical_result->push_logical_no_check(ElementForPosition(pos) ? true : false);
2077+
}
2078+
2079+
return logical_result_SP;
2080+
}
2081+
20202082
// ********************* (integer$)setAncestralNucleotides(is sequence)
20212083
//
20222084
EidosValue_SP Chromosome::ExecuteMethod_setAncestralNucleotides(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter)
@@ -2577,6 +2639,8 @@ const std::vector<EidosMethodSignature_CSP> *Chromosome_Class::Methods(void) con
25772639

25782640
methods->emplace_back((EidosInstanceMethodSignature *)(new EidosInstanceMethodSignature(gStr_ancestralNucleotides, kEidosValueMaskInt | kEidosValueMaskString))->AddInt_OSN(gEidosStr_start, gStaticEidosValueNULL)->AddInt_OSN(gEidosStr_end, gStaticEidosValueNULL)->AddString_OS("format", EidosValue_String_SP(new (gEidosValuePool->AllocateChunk()) EidosValue_String("string"))));
25792641
methods->emplace_back((EidosInstanceMethodSignature *)(new EidosInstanceMethodSignature(gStr_drawBreakpoints, kEidosValueMaskInt))->AddObject_OSN("parent", gSLiM_Individual_Class, gStaticEidosValueNULL)->AddInt_OSN("n", gStaticEidosValueNULL));
2642+
methods->emplace_back((EidosInstanceMethodSignature *)(new EidosInstanceMethodSignature(gStr_genomicElementForPosition, kEidosValueMaskObject, gSLiM_GenomicElement_Class))->AddInt("positions"));
2643+
methods->emplace_back((EidosInstanceMethodSignature *)(new EidosInstanceMethodSignature(gStr_hasGenomicElementForPosition, kEidosValueMaskLogical))->AddInt("positions"));
25802644
methods->emplace_back((EidosInstanceMethodSignature *)(new EidosInstanceMethodSignature(gStr_setAncestralNucleotides, kEidosValueMaskInt | kEidosValueMaskSingleton))->AddIntString("sequence"));
25812645
methods->emplace_back((EidosInstanceMethodSignature *)(new EidosInstanceMethodSignature(gStr_setGeneConversion, kEidosValueMaskVOID))->AddNumeric_S("nonCrossoverFraction")->AddNumeric_S("meanLength")->AddNumeric_S("simpleConversionFraction")->AddNumeric_OS("bias", gStaticEidosValue_Integer0));
25822646
methods->emplace_back((EidosInstanceMethodSignature *)(new EidosInstanceMethodSignature(gStr_setHotspotMap, kEidosValueMaskVOID))->AddNumeric("multipliers")->AddInt_ON("ends", gStaticEidosValueNULL)->AddString_OS("sex", gStaticEidosValue_StringAsterisk));

core/chromosome.h

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -60,6 +60,7 @@ class Chromosome : public EidosDictionaryRetained
6060
private:
6161
#endif
6262

63+
// This vector contains all the genomic elements for this chromosome. It is in sorted order once initialization is complete.
6364
std::vector<GenomicElement *> genomic_elements_; // OWNED POINTERS: genomic elements belong to the chromosome
6465

6566
// We now allow different recombination maps for males and females, optionally. Unfortunately, this means we have a bit of an
@@ -229,6 +230,9 @@ class Chromosome : public EidosDictionaryRetained
229230
double &p_both_0, double &p_both_0_OR_mut_0_break_non0, double &p_both_0_OR_mut_0_break_non0_OR_mut_non0_break_0);
230231
#endif
231232

233+
// looking up genomic elements quickly, by binary search
234+
GenomicElement *ElementForPosition(slim_position_t pos);
235+
232236
// internal methods for throwing errors from inline functions when assumptions about the configuration of maps are violated
233237
void MutationMapConfigError(void) const __attribute__((__noreturn__)) __attribute__((cold)) __attribute__((analyzer_noreturn));
234238
void RecombinationMapConfigError(void) const __attribute__((__noreturn__)) __attribute__((cold)) __attribute__((analyzer_noreturn));
@@ -250,6 +254,8 @@ class Chromosome : public EidosDictionaryRetained
250254

251255
virtual EidosValue_SP ExecuteInstanceMethod(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter) override;
252256
EidosValue_SP ExecuteMethod_ancestralNucleotides(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter);
257+
EidosValue_SP ExecuteMethod_genomicElementForPosition(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter);
258+
EidosValue_SP ExecuteMethod_hasGenomicElementForPosition(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter);
253259
EidosValue_SP ExecuteMethod_setAncestralNucleotides(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter);
254260
EidosValue_SP ExecuteMethod_setGeneConversion(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter);
255261
EidosValue_SP ExecuteMethod_setHotspotMap(EidosGlobalStringID p_method_id, const std::vector<EidosValue_SP> &p_arguments, EidosInterpreter &p_interpreter);

core/slim_globals.cpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1271,6 +1271,8 @@ const std::string &gStr_maxDistance = EidosRegisteredString("maxDistance", gID_m
12711271
// mostly method names
12721272
const std::string &gStr_ancestralNucleotides = EidosRegisteredString("ancestralNucleotides", gID_ancestralNucleotides);
12731273
const std::string &gStr_nucleotides = EidosRegisteredString("nucleotides", gID_nucleotides);
1274+
const std::string &gStr_genomicElementForPosition = EidosRegisteredString("genomicElementForPosition", gID_genomicElementForPosition);
1275+
const std::string &gStr_hasGenomicElementForPosition = EidosRegisteredString("hasGenomicElementForPosition", gID_hasGenomicElementForPosition);
12741276
const std::string &gStr_setAncestralNucleotides = EidosRegisteredString("setAncestralNucleotides", gID_setAncestralNucleotides);
12751277
const std::string &gStr_setGeneConversion = EidosRegisteredString("setGeneConversion", gID_setGeneConversion);
12761278
const std::string &gStr_setHotspotMap = EidosRegisteredString("setHotspotMap", gID_setHotspotMap);

core/slim_globals.h

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -865,6 +865,8 @@ extern const std::string &gStr_maxDistance;
865865

866866
extern const std::string &gStr_ancestralNucleotides;
867867
extern const std::string &gStr_nucleotides;
868+
extern const std::string &gStr_genomicElementForPosition;
869+
extern const std::string &gStr_hasGenomicElementForPosition;
868870
extern const std::string &gStr_setAncestralNucleotides;
869871
extern const std::string &gStr_setGeneConversion;
870872
extern const std::string &gStr_setHotspotMap;
@@ -1283,6 +1285,8 @@ enum _SLiMGlobalStringID : int {
12831285

12841286
gID_ancestralNucleotides,
12851287
gID_nucleotides,
1288+
gID_genomicElementForPosition,
1289+
gID_hasGenomicElementForPosition,
12861290
gID_setAncestralNucleotides,
12871291
gID_setGeneConversion,
12881292
gID_setHotspotMap,

0 commit comments

Comments
 (0)