Commit a33e43b

Merge pull request #107 from dh-tech/feature/convert-hijri
Add support for converting from Hijri calendar to undate and undate interval
2 parents 57f8f66 + 4372b23 commit a33e43b

31 files changed

Lines changed: 1265 additions & 118 deletions

.github/workflows/unit_tests.yml

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -8,6 +8,8 @@ on:
88
- 'undate/**'
99
- 'tests/**'
1010
pull_request:
11+
branches:
12+
- "**"
1113

1214
env:
1315
# python version used to calculate and submit code coverage

README.md

Lines changed: 28 additions & 1 deletion
@@ -140,7 +140,7 @@ An `UndateInterval` is a date range between two `Undate` objects. Intervals can
140140
```
141141

142142
You can initialize `Undate` or `UndateInterval` objects by parsing a date string with a specific converter, and you can also output an `Undate` object in those formats.
143-
Available converters are "ISO8601" and "EDTF" (but only)
143+
Currently available converters are "ISO8601", "EDTF", and the supported calendars.
144144

145145
```python
146146
>>> from undate import Undate
@@ -156,6 +156,33 @@ Available converters are "ISO8601" and "EDTF" (but only)
156156
<UndateInterval 1800/1900>
157157
```
158158

159+
### Calendars
160+
161+
All `Undate` objects are calendar aware, and date converters include support for parsing and working with dates from other calendars. The Gregorian calendar is used by default; currently `undate` supports the Hijri Islamic calendar and the Anno Mundi Hebrew calendar based on calendar conversion logic implemented in the [convertdate](https://convertdate.readthedocs.io/en/latest/) package.
162+
163+
Dates are stored with the year, month, day and appropriate precision for the original calendar; internally, earliest and latest dates are calculated in Gregorian / Proleptic Gregorian calendar for standardized comparison across dates from different calendars.
164+
165+
```python
166+
>>> from undate import Undate
167+
>>> tammuz4816 = Undate.parse("26 Tammuz 4816", "Hebrew")
168+
>>> tammuz4816
169+
<Undate '26 Tammuz 4816 Anno Mundi' 4816-04-26 (Hebrew)>
170+
>>> rajab495 = Undate.parse("Rajab 495", "Hijri")
171+
>>> rajab495
172+
<Undate 'Rajab 495 Hijrī' 0495-07 (Hijri)>
173+
>>> y2k = Undate.parse("2001", "EDTF")
174+
>>> y2k
175+
<Undate 2001 (Gregorian)>
176+
>>> [str(d.earliest) for d in [rajab495, tammuz4816, y2k]]
177+
['1102-04-28', '1056-07-17', '2001-01-01']
178+
>>> [str(d.precision) for d in [rajab495, tammuz4816, y2k]]
179+
['MONTH', 'DAY', 'YEAR']
180+
>>> sorted([rajab495, tammuz4816, y2k])
181+
[<Undate '26 Tammuz 4816 Anno Mundi' 4816-04-26 (Hebrew)>, <Undate 'Rajab 495 Hijrī' 0495-07 (Hijri)>, <Undate 2001 (Gregorian)>]
182+
```
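The comparison described above relies on each date carrying a Gregorian "earliest" value. The following is a minimal standalone sketch of that idea, not the `undate` implementation: the `CalendarDate` class is hypothetical, and the Gregorian dates are copied from the README example output rather than computed.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(order=True)
class CalendarDate:
    """Hypothetical stand-in: ordering uses only the Gregorian earliest date."""

    earliest: date
    label: str = field(compare=False)


# Gregorian equivalents taken from the README example output above
rajab495 = CalendarDate(date(1102, 4, 28), "Rajab 495 Hijrī")
tammuz4816 = CalendarDate(date(1056, 7, 17), "26 Tammuz 4816 Anno Mundi")
y2k = CalendarDate(date(2001, 1, 1), "2001")

ordered = sorted([rajab495, tammuz4816, y2k])
print([d.label for d in ordered])
# ['26 Tammuz 4816 Anno Mundi', 'Rajab 495 Hijrī', '2001']
```

Because every date is reduced to a single calendar for comparison, dates that originate in different calendars sort consistently while keeping their original labels.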
183+
184+
* * *
185+
159186
For more examples, refer to the [example notebooks](https://github.com/dh-tech/undate-python/tree/main/examples/notebooks/) included in this repository.
160187

161188
## Documentation

docs/undate/converters.rst

Lines changed: 29 additions & 6 deletions
@@ -1,19 +1,25 @@
11
Converters
22
==========
33

4+
Overview
5+
--------
6+
47
.. automodule:: undate.converters.base
58
:members:
69
:undoc-members:
710

11+
Formats
12+
--------
13+
814
ISO8601
9-
-------
15+
^^^^^^^
1016

1117
.. automodule:: undate.converters.iso8601
1218
:members:
1319
:undoc-members:
1420

1521
Extended Date-Time Format (EDTF)
16-
--------------------------------
22+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1723

1824
.. automodule:: undate.converters.edtf.converter
1925
:members:
@@ -23,8 +29,25 @@ Extended Date-Time Format (EDTF)
2329
:members:
2430
:undoc-members:
2531

26-
.. transformer is more of an internal, probably doesn't make sense to include
27-
.. .. automodule:: undate.converters.edtf.transformer
28-
.. :members:
29-
.. :undoc-members:
32+
33+
Calendars
34+
---------
35+
36+
Gregorian
37+
^^^^^^^^^
38+
39+
.. automodule:: undate.converters.calendars.gregorian
40+
:members:
41+
42+
Hijri (Islamic calendar)
43+
^^^^^^^^^^^^^^^^^^^^^^^^
44+
45+
.. automodule:: undate.converters.calendars.hijri.converter
46+
:members:
47+
48+
Anno Mundi (Hebrew calendar)
49+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
50+
51+
.. automodule:: undate.converters.calendars.hebrew.converter
52+
:members:
3053

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ readme = "README.md"
99
license = { text = "Apache-2" }
1010
requires-python = ">= 3.9"
1111
dynamic = ["version"]
12-
dependencies = ["lark", "numpy"]
12+
dependencies = ["lark[interegular]", "numpy", "convertdate", "strenum; python_version < '3.11'"]
1313
authors = [
1414
{ name = "Rebecca Sutton Koeser" },
1515
{ name = "Cole Crawford" },

src/undate/converters/base.py

Lines changed: 74 additions & 5 deletions
@@ -1,10 +1,11 @@
11
"""
2-
:class:`undate.converters.BaseDateConverter` provides a base class for
2+
:class:`~undate.converters.BaseDateConverter` provides a base class for
33
implementing date converters, which can provide support for
4-
parsing and generating dates in different formats and also converting
5-
dates between different calendars.
4+
parsing and generating dates in different formats.
5+
The converter subclass :class:`undate.converters.BaseCalendarConverter`
6+
provides additional functionality needed for calendar conversion.
67
7-
To add support for a new date format or calendar conversion:
8+
To add support for a new date converter:
89
910
- Create a new file under ``undate/converters/``
1011
- For converters with sufficient complexity, you may want to create a submodule;
@@ -18,6 +19,26 @@
1819
The new subclass should be loaded automatically and included in the converters
1920
returned by :meth:`BaseDateConverter.available_converters`
2021
22+
To add support for a new calendar converter:
23+
24+
- Create a new file under ``undate/converters/calendars/``
25+
- For converters with sufficient complexity, you may want to create a submodule;
26+
see ``undate.converters.calendars.hijri`` for an example.
27+
- Extend ``BaseCalendarConverter`` and implement ``parse`` and ``to_string``
28+
formatter methods as desired/appropriate for your converter as well as the
29+
additional methods for ``max_month``, ``max_day``, and conversion ``to_gregorian``
30+
calendar.
31+
- Import your calendar in ``undate/converters/calendars/__init__.py`` and include in ``__all__``
32+
- Add unit tests for the new calendar logic under ``tests/test_converters/calendars/``
33+
- Add the new calendar to the ``Calendar`` enum of supported calendars in
34+
``undate/undate.py`` and confirm that the ``get_converter`` method loads your
35+
calendar converter correctly (an existing unit test should cover this).
36+
- Consider creating a notebook to demonstrate the use of the calendar
37+
converter.
38+
39+
Calendar converter subclasses are also automatically loaded and included
40+
in the list of available converters.
41+
2142
-------------------
2243
"""
2344

@@ -90,6 +111,54 @@ def available_converters(cls) -> Dict[str, Type["BaseDateConverter"]]:
90111
"""
91112
Dictionary of available converters keyed on name.
92113
"""
114+
return {c.name: c for c in cls.subclasses()} # type: ignore
115+
116+
@classmethod
117+
def subclasses(cls) -> list[Type["BaseDateConverter"]]:
118+
"""
119+
List of available converter classes. Includes calendar converter
120+
subclasses.
121+
"""
93122
# ensure undate converters are imported
94123
cls.import_converters()
95-
return {c.name: c for c in cls.__subclasses__()} # type: ignore
124+
125+
# find all direct subclasses, excluding base calendar converter
126+
subclasses = cls.__subclasses__()
127+
subclasses.remove(BaseCalendarConverter)
128+
# add all subclasses of calendar converter base class
129+
subclasses.extend(BaseCalendarConverter.__subclasses__())
130+
return subclasses
131+
132+
133+
class BaseCalendarConverter(BaseDateConverter):
134+
"""Base class for calendar converters, with additional methods required
135+
for calendars."""
136+
137+
#: Converter name. Subclasses must define a unique name.
138+
name: str = "Base Calendar Converter"
139+
140+
def min_month(self) -> int:
141+
"""Smallest numeric month for this calendar."""
142+
raise NotImplementedError
143+
144+
def max_month(self, year: int) -> int:
145+
"""Maximum numeric month for this calendar"""
146+
raise NotImplementedError
147+
148+
def first_month(self) -> int:
149+
"""first month in this calendar; by default, returns :meth:`min_month`."""
150+
return self.min_month()
151+
152+
def last_month(self, year: int) -> int:
153+
"""last month in this calendar; by default, returns :meth:`max_month`."""
154+
return self.max_month(year)
155+
156+
def max_day(self, year: int, month: int) -> int:
157+
"""maximum numeric day for the specified year and month in this calendar"""
158+
raise NotImplementedError
159+
160+
def to_gregorian(self, year, month, day) -> tuple[int, int, int]:
161+
"""Convert a date for this calendar specified by numeric year, month, and day,
162+
into the Gregorian equivalent date. Should return a tuple of year, month, day.
163+
"""
164+
raise NotImplementedError
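The subclass discovery added in this file can be illustrated in isolation. This is a simplified sketch of the pattern only: the class names below are hypothetical stand-ins, and it filters out the intermediate base class with a comprehension instead of calling `list.remove`, so it also works when invoked on a class that has no calendar base among its direct subclasses.

```python
from typing import Dict, List, Type


class BaseConverter:
    """Hypothetical stand-in for a base date converter."""

    name: str = "Base"

    @classmethod
    def subclasses(cls) -> List[Type["BaseConverter"]]:
        # direct subclasses, minus the intermediate calendar base class
        found = [c for c in cls.__subclasses__() if c is not BaseCalendar]
        # add the converters that extend the calendar base class
        found.extend(BaseCalendar.__subclasses__())
        return found

    @classmethod
    def available_converters(cls) -> Dict[str, Type["BaseConverter"]]:
        # dictionary of converter classes keyed on their name
        return {c.name: c for c in cls.subclasses()}


class BaseCalendar(BaseConverter):
    name = "Base Calendar"


class IsoConverter(BaseConverter):
    name = "ISO8601"


class LunarCalendar(BaseCalendar):
    name = "Lunar"


print(sorted(BaseConverter.available_converters()))
# ['ISO8601', 'Lunar']
```

The intermediate base class is excluded from the result because it is abstract in practice; only concrete format and calendar converters are listed.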
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
from undate.converters.calendars.gregorian import GregorianDateConverter
2+
from undate.converters.calendars.hijri import HijriDateConverter
3+
from undate.converters.calendars.hebrew import HebrewDateConverter
4+
5+
__all__ = ["HijriDateConverter", "GregorianDateConverter", "HebrewDateConverter"]
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
from calendar import monthrange
2+
3+
from undate.converters.base import BaseCalendarConverter
4+
5+
6+
class GregorianDateConverter(BaseCalendarConverter):
7+
"""
8+
Calendar converter class for Gregorian calendar.
9+
"""
10+
11+
#: converter name: Gregorian
12+
name: str = "Gregorian"
13+
#: calendar
14+
calendar_name: str = "Gregorian"
15+
16+
#: known non-leap year
17+
NON_LEAP_YEAR: int = 2022
18+
19+
def min_month(self) -> int:
20+
"""First month for the Gregorian calendar."""
21+
return 1
22+
23+
def max_month(self, year: int) -> int:
24+
"""maximum numeric month for the specified year in the Gregorian calendar"""
25+
return 12
26+
27+
def max_day(self, year: int, month: int) -> int:
28+
"""maximum numeric day for the specified year and month in this calendar"""
29+
# if month is known, use that to calculate
30+
if month:
31+
# if year is known, use it; otherwise use a known non-leap year
32+
# (only matters for February)
33+
year = year or self.NON_LEAP_YEAR
34+
35+
# Use monthrange from python builtin calendar module.
36+
# returns first day of the month and number of days in the month
37+
# for the specified year and month.
38+
_, max_day = monthrange(year, month)
39+
else:
40+
# if year and month are unknown, return maximum possible
41+
max_day = 31
42+
43+
return max_day
44+
45+
def to_gregorian(self, year, month, day) -> tuple[int, int, int]:
46+
"""Convert to Gregorian date. This returns the specified year, month,
47+
and day unchanged, but is provided for consistency since all calendar
48+
converters need to support conversion to Gregorian calendar for
49+
a common point of comparison.
50+
"""
51+
return (year, month, day)
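The `max_day` fallback logic above can be tried on its own. The sketch below restates it as a standalone function (the function form and the `Optional` type hints are illustrative assumptions, not part of the commit), using the same `calendar.monthrange` call and the same known non-leap year.

```python
from calendar import monthrange
from typing import Optional

# known non-leap year, used when the year is unknown
NON_LEAP_YEAR = 2022


def max_day(year: Optional[int], month: Optional[int]) -> int:
    """Maximum day of the month, tolerating an unknown year or month."""
    if not month:
        # month unknown: return the largest possible day
        return 31
    # monthrange returns (weekday of first day, number of days in month)
    _, days = monthrange(year or NON_LEAP_YEAR, month)
    return days


print(max_day(2024, 2))     # 29 (leap year)
print(max_day(None, 2))     # 28 (falls back to a non-leap year)
print(max_day(None, None))  # 31 (month unknown)
```

Falling back to a non-leap year only affects February, where it yields the conservative value of 28 when the year is unknown.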
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
from undate.converters.calendars.hebrew.converter import HebrewDateConverter
2+
3+
__all__ = ["HebrewDateConverter"]
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
1+
from typing import Union
2+
3+
from convertdate import hebrew # type: ignore
4+
from lark.exceptions import UnexpectedCharacters
5+
6+
from undate.converters.base import BaseCalendarConverter
7+
from undate.converters.calendars.hebrew.parser import hebrew_parser
8+
from undate.converters.calendars.hebrew.transformer import HebrewDateTransformer
9+
from undate.undate import Undate, UndateInterval
10+
11+
12+
class HebrewDateConverter(BaseCalendarConverter):
13+
"""
14+
Converter for Hebrew Anno Mundi calendar.
15+
16+
Support for parsing Anno Mundi dates and converting to Undate and UndateInterval
17+
objects in the Gregorian calendar.
18+
"""
19+
20+
#: converter name: Hebrew
21+
name: str = "Hebrew"
22+
calendar_name: str = "Anno Mundi"
23+
24+
def __init__(self):
25+
self.transformer = HebrewDateTransformer()
26+
27+
def min_month(self) -> int:
28+
"""Smallest numeric month for this calendar."""
29+
return 1
30+
31+
def max_month(self, year: int) -> int:
32+
"""Maximum numeric month for this calendar. In Hebrew calendar, this is 12 or 13
33+
depending on whether it is a leap year."""
34+
return hebrew.year_months(year)
35+
36+
def first_month(self) -> int:
37+
"""First month in this calendar. The Hebrew civil year starts in Tishri."""
38+
return hebrew.TISHRI
39+
40+
def last_month(self, year: int) -> int:
41+
"""Last month in this calendar. Hebrew civil year starts in Tishri,
42+
Elul is the month before Tishri."""
43+
return hebrew.ELUL
44+
45+
def max_day(self, year: int, month: int) -> int:
46+
"""maximum numeric day for the specified year and month in this calendar"""
47+
# NOTE: unreleased v2.4.1 of convertdate standardizes month_days to month_length
48+
return hebrew.month_days(year, month)
49+
50+
def to_gregorian(self, year: int, month: int, day: int) -> tuple[int, int, int]:
51+
"""Convert a Hebrew date, specified by year, month, and day,
52+
to the Gregorian equivalent date. Returns a tuple of year, month, day.
53+
"""
54+
return hebrew.to_gregorian(year, month, day)
55+
56+
def parse(self, value: str) -> Union[Undate, UndateInterval]:
57+
"""
58+
Parse a Hebrew date string and return an :class:`~undate.undate.Undate` or
59+
:class:`~undate.undate.UndateInterval`.
60+
The Hebrew date string is preserved in the undate label.
61+
"""
62+
if not value:
63+
raise ValueError("Parsing empty string is not supported")
64+
65+
# parse the input string, then transform to undate object
66+
try:
67+
# parse the string with our Hebrew date parser
68+
parsetree = hebrew_parser.parse(value)
69+
# transform the parse tree into an undate or undate interval
70+
undate_obj = self.transformer.transform(parsetree)
71+
# set the original date as a label, with the calendar name
72+
undate_obj.label = f"{value} {self.calendar_name}"
73+
return undate_obj
74+
except UnexpectedCharacters as err:
75+
raise ValueError(f"Could not parse '{value}' as a Hebrew date") from err
76+
77+
# do we need to support conversion the other direction?
78+
# i.e., generate a Hebrew date from an arbitrary undate or undate interval?
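The reason `first_month` and `last_month` are overridden separately from `min_month` and `max_month` is that the civil year does not begin at month number 1. The following sketch shows the resulting month order without depending on `convertdate`. The constants and helper functions are assumptions for illustration: months are numbered from Nisan = 1 (consistent with Tammuz appearing as month 04 in the README example), so Tishri is 7 and Elul is 6, and the leap-year test is the standard 19-year cycle rule.

```python
# Assumed month numbers, counting from Nisan = 1
ELUL = 6
TISHRI = 7


def is_leap(year: int) -> bool:
    """Standard 19-year cycle test for a Hebrew leap year."""
    return (year * 7 + 1) % 19 < 7


def max_month(year: int) -> int:
    """Highest month number: 13 in a leap year, otherwise 12."""
    return 13 if is_leap(year) else 12


def months_in_civil_order(year: int) -> list:
    """Month numbers from Tishri through Elul for one civil year."""
    # Tishri up to the highest-numbered month, then wrap around to Elul
    return list(range(TISHRI, max_month(year) + 1)) + list(range(1, ELUL + 1))


print(months_in_civil_order(5785))
# [7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6]
print(len(months_in_civil_order(5784)))
# 13
```

This is why the earliest and latest possible dates for a year-only Hebrew date are anchored on Tishri and Elul rather than on the smallest and largest month numbers.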
