Dataset is an abstraction of the local file system.
Users can add their local paths into this system to easily access the data inside.
The basic concept is to treat a data file as a property of a
The following docs show how easy it is to interact with the data in this system.
Assume that you have the data files
data1.pd.xz under dir /set1 and
data2.pd.xz under dir /set2.
The following code creates a
Dataset object containing all available files under /set1 and /set2.
>>> from xenonpy.datatools import Dataset
>>> dataset = Dataset('/set1', '/set2')
>>> dataset
<Dataset> includes:
"data1": /set1/data1.pd.xz
"data2": /set2/data2.pd.xz
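To make the idea of indexing files by name concrete, here is a minimal sketch of how such a name-to-path listing might be built. It is not XenonPy's implementation: the function name index_data_files is hypothetical, and the sketch assumes that a data name is the file name up to the first dot and that earlier directories win when two directories contain the same name.

```python
from pathlib import Path


def index_data_files(*dirs):
    """Map each data name to the first matching file found in `dirs`."""
    index = {}
    for d in dirs:
        for path in sorted(Path(d).iterdir()):
            # Skip sub-directories and hidden files.
            if not path.is_file() or path.name.startswith("."):
                continue
            # Assumption: the data name is the file name up to the first dot,
            # so "data1.pd.xz" is indexed as "data1".
            name = path.name.split(".", 1)[0]
            # Assumption: earlier directories take precedence on name clashes.
            index.setdefault(name, path)
    return index
```

Calling index_data_files('/set1', '/set2') on the layout above would return a dict with the keys "data1" and "data2", each pointing at the corresponding file path.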
Now, you can retrieve data by their name like this:
Here, the
dataset loaded a file named
data1.pd.xz from /set1 or /set2.
In this case, /set1/data1.pd.xz was loaded.
Note that we accessed a property named
dataframe before loading
data1 so that
dataset knows it is loading a
pandas.DataFrame object file using the
Currently, four loaders are available out of the box. The built-in loaders are summarised below.
pandas.DataFrame object file
common pickled files
The default loader is
dataframe. This means that if you want to load a pandas.DataFrame object, you can omit the
The following code does exactly the same work as explained above:
You can also specify the default loader by setting the
>>> dataset = Dataset('/set1', '/set2', backend='csv')
>>> dataset.data1  # this will load '/set1/data1.csv'
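The interplay between a loader property and a default backend can be illustrated with a small, self-contained toy. This is only a sketch under stated assumptions and not XenonPy's code: MiniDataset, LOADERS, and the file suffixes are hypothetical, the sketch supports only csv and plain pickle files from the standard library, and its default backend is 'pickle' rather than dataframe.

```python
import csv
import pickle
from pathlib import Path


def _read_csv(path):
    with open(path, newline="") as f:
        return list(csv.reader(f))


def _read_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical loader table: loader name -> (file suffix, reader function).
LOADERS = {
    "csv": (".csv", _read_csv),
    "pickle": (".pkl", _read_pickle),
}


class MiniDataset:
    """Toy stand-in for a Dataset; not XenonPy's actual implementation."""

    def __init__(self, *dirs, backend="pickle"):
        self._dirs = [Path(d) for d in dirs]
        self._backend = backend

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in LOADERS:
            # `ds.csv` returns a view over the same directories
            # that reads files with the csv loader.
            return MiniDataset(*self._dirs, backend=name)
        # Otherwise treat `name` as a data name and use the default backend.
        suffix, reader = LOADERS[self._backend]
        for d in self._dirs:
            path = d / (name + suffix)
            if path.is_file():
                return reader(path)
        raise FileNotFoundError(f"no {name}{suffix} under {self._dirs}")
```

With this toy, MiniDataset('/set1', '/set2', backend='csv').data1 reads /set1/data1.csv if it exists, while .pickle.data1 on the same object would look for data1.pkl instead. Directories are searched in the order given, so the first match wins.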
XenonPy also uses this system to provide some built-in data.
Currently, two sets of element-level property data are available out-of-the-box (
elements_completed (imputed version of
These data were collected from mendeleev, pymatgen, the CRC Handbook, and Magpie.
To know the details of
elements_completed, see Data access
Use the following code to load
>>> from xenonpy.datatools import preset
>>> preset.elements
>>> preset.elements_completed
If you get a
file not exist error, run the following code to sync your local dataset.
>>> from xenonpy.datatools import preset
>>> preset.sync('elements')
>>> preset.sync('elements_completed')
There are also some advanced uses of
preset. For more details, see tutorial/1-dataset:Advance.
Also see the jupyter files at:
For implementation details, see our sample code: