Source code for posydon.grids.psygrid

"""Module defining the PSyGrid, the main grid object of POSYDON.

I. CREATING A GRID
------------------
a) One-liner for creating a grid without loading it in memory:

    mygrid = PSyGrid(verbose=True).create("mysimulations/", "mygrids/grid1.h5")

b) Create a grid and keep it for processing:

    mygrid = PSyGrid()
    mygrid.create("MESA_simulations/ZAMS_grid/", "stored_grids/ZAMS_grid.h5")
    print(mygrid)

c) Store only metadata and initial/final values of runs:

    mygrid = PSyGrid().create(in_dir_path, out_file_path, slim=True)

d) Various options (see definition of `create` and `GRIDPROPERTIES` for more):

    mygrid = PSyGrid().create(in_dir_path, out_file_path,
                              overwrite=True,
                              warn="suppress",
                              max_number_of_runs=1000,
                              description="My 1000-run grid.")

    NOTE: you can use a family of HDF5 files in case of file size restrictions.
    For example, using `mygrid.h5.%d` for the path when creating/loading grids,
    implies that split files will be `mygrid.h5.0`, `mygrid.h5.1`, and so on.
    When overwriting the file, all the family is overwritten (effectively, the
    files are emptied to avoid remainder split files from previous operations.)

e) The history/profile data can be downsampled to significantly reduce the size
   of the grid. There are three downsampling procedures: one combining all data
   from histories (binary / star 1 / star 2), and two for each star's profile.
   The parameters of the downsampling of histories and profiles can be set via
   the `GRIDPROPERTIES` dictionary (see below). For example, for history down-
   sampling, you can set a tolerance level (between 0 and 1) which corresponds
   to the maximum interpolation error tolerated with respect to the range of
   the parameters:

   history_DS_error = 0.01.

   Columns to be ignored during the downsampling process are set using the
   `history_DS_exclude` parameter. The corresponding parameters for profile
    downsampling are `profile_DS_error` and `profile_DS_exclude`.

f) The input path can be a list of paths to indicate multiple folders with MESA
   runs. Each of the paths can have wildcard characters to select subfolders or
   even runs. E.g.:
   - ["/grid_part1", "/grid_part2"]
   - "/grid_part*"
   - ["/grid/run1", "/grid/run2*"]

   Note that all provided paths are searched recursively, allowing for hierar-
   chical organizations of the MESA runs. For example, the parent folder could
   look like this:

   part_A/
     run1/
     run2/
   part_B/
     run3/
     run4/
   extra_run1/
   extra_run2/

g) Only one run can be included for a specific set of initial parameters, being
   identified by the folder name (excluding the `_grid_index*` part). E.g.,

   reruns/m1_28_m2_11_initial_z_1.42e-02_initial_period_in_days_2_grid_index_0

   will be preferred over

   main/m1_28_m2_11_initial_z_1.42e-02_initial_period_in_days_2_grid_index_102

   if the PSyGrid is created with the appropriate order of input paths:

       mygrid = PSyGrid(["mygrid/main", "mygrid/reruns"])

   The user is notified through warnings about such substitutions.

h) To include specific columns from MESA data files, set the grid properties of
   the form `*_saved_columns`. For example,

       grid.create("mygrid/", "mygrid.h5",
                   star1_history_saved_columns=["star_mass"])

   By default, the provided columns will be added to the default list. In order
   to select the exact set of columns, provide them in a tuple instead of list.


II. READING DATA FROM A GRID
----------------------------
mygrid = PSyGrid("thegrid.h5")  # opens a stored grid

# Indexing...
t1 = mygrid[0]["binary_history"]["age"]
t3 = mygrid[0].binary_history["age"]

tenth_run = mygrid[10]
if tenth_run["binary_history"] is None:
    print("No binary history in 10th run.")


Use `.close()` method to close the HDF5 file, or delete the grid variable. E.g.

    mygrid.close()

    or

    del mygrid


III. OTHER FUNCTIONS
--------------------
a) Printing / logging summary:
    print(mygrid)

    with open("grid_summary.txt", "w") as f:
        f.write(str(mygrid))

b) In `for` loop (`iter` and `next` are supported too):

    for run in mygrid:
        print(run)

c) Getting the number of runs:

    n_runs = len(psygrid)

d) Checking if run index is defined in grid:

    if (13 in grid):
        print("My good luck lasted for", grid[13].binary_history.age[-1], "yr")

e) Getting list or set of runs (i.e., PSyGridView objects):

    run_lst = list(psygrid)
    run_set = set(psygrid)

f) Printing parameters used during the creation of the grid

    print(psygrid.config)


WARNING: reducing reading times when accessing a PSyGrid
--------------------------------------------------------
When accessing a table in a specific run, e.g.,

    mygrid[0].history1

    or

    for run in mygrid:
        do_something_with(run.history1)

the table is loaded from the HDF5 file temporarily. This means that:

    myrun.history1.star_mass + myrun.history1.star_mass

loads the table two times! To reduce reading times, store all the tables
that are going to be needed more than once in local variables:

    myhistory1 = myrun.history1
    myhistory1.star_mass + myhistory1.star_mass

"""


__authors__ = [
    "Konstantinos Kovlakas <Konstantinos.Kovlakas@unige.ch>",
    "Simone Bavera <Simone.Bavera@unige.ch>",
    "Jeffrey Andrews <jeffrey.andrews@northwestern.edu>",
    "Scott Coughlin <scottcoughlin2014@u.northwestern.edu>",
    "Devina Misra <devina.misra@unige.ch>",
    "Kyle Akira Rocha <kylerocha2024@u.northwestern.edu>",
    "Matthias Kruckow <Matthias.Kruckow@unige.ch>",
]


import os
import glob
import json
import ast
import h5py
import numpy as np
import pandas as pd
import tqdm

from posydon.utils.common_functions import check_state_of_star_history_array
from posydon.grids.io import (GridReader, read_initial_values,
                              initial_values_from_dirname, EEP_FILE_EXTENSIONS)
from posydon.grids.termination_flags import (get_flags_from_MESA_run,
                                             check_state_from_history,
                                             get_flag_from_MESA_output,
                                             infer_interpolation_class,
                                             get_detected_initial_RLO,
                                             get_nearest_known_initial_RLO)
from posydon.utils.configfile import ConfigFile
from posydon.utils.common_functions import (orbital_separation_from_period,
                                            initialize_empty_array,
                                            infer_star_state,
                                            get_i_He_depl)
from posydon.utils.limits_thresholds import (THRESHOLD_CENTRAL_ABUNDANCE,
    THRESHOLD_CENTRAL_ABUNDANCE_LOOSE_C
)
from posydon.utils.gridutils import (read_MESA_data_file, read_EEP_data_file,
                                     add_field, join_lists, fix_He_core)
from posydon.visualization.plot2D import plot2D
from posydon.visualization.plot1D import plot1D
from posydon.grids.downsampling import TrackDownsampler
from posydon.grids.scrubbing import scrub, keep_after_RLO, keep_till_central_abundance_He_C
from posydon.utils.ignorereason import IgnoreReason
from posydon.utils.posydonwarning import (Pwarn, Catch_POSYDON_Warnings)


HDF5_MEMBER_SIZE = 2**31 - 1            # maximum HDF5 file size when splitting
H5_UNICODE_DTYPE = h5py.string_dtype()  # dtype for unicode strings in the HDF5
H5_REC_STR_DTYPE = "S70"    # string dtype when writing record arrays to HDF5

# valid keys in a run in the HDF5 file
VALID_KEYS = ["binary_history", "history1", "history2", "final_profile1",
              "final_profile2", "initial_values", "final_values"]

# Acceptable values for `warn` argument of PSyGrid.create()
WARN_VALUES = ["end", "normal", "suppress"]

# Number of termination flags per run
N_FLAGS = 4
N_FLAGS_SINGLE = 2
TERMINATION_FLAG_COLUMNS = ["termination_flag_{}".format(i+1)
                            for i in range(N_FLAGS)]
TERMINATION_FLAG_COLUMNS_SINGLE = ["termination_flag_1", "termination_flag_3"]


# Default columns to be included from history and profile tables
DEFAULT_BINARY_HISTORY_COLS = [
    "model_number", "age",
    "star_1_mass", "star_2_mass",
    "period_days", "binary_separation",
    "lg_system_mdot_1", "lg_system_mdot_2",
    "lg_wind_mdot_1", "lg_wind_mdot_2",
    "lg_mstar_dot_1", "lg_mstar_dot_2",
    "lg_mtransfer_rate", "xfer_fraction",
    "rl_relative_overflow_1", "rl_relative_overflow_2",
    "trap_radius", "acc_radius",
    "t_sync_rad_1", "t_sync_conv_1", "t_sync_rad_2", "t_sync_conv_2"
]

DEFAULT_STAR_HISTORY_COLS = [
    "he_core_mass", "c_core_mass", "o_core_mass",
    "he_core_radius", "c_core_radius", "o_core_radius",
    "center_h1", "center_he4", "center_c12", "center_n14", "center_o16",
    "surface_h1", "surface_he4", "surface_c12", "surface_n14", "surface_o16",
    "c12_c12", "center_gamma", "avg_c_in_c_core",
    "surf_avg_omega", "surf_avg_omega_div_omega_crit",
    "log_LH", "log_LHe", "log_LZ", "log_Lnuc", "log_Teff", "log_L", "log_R",
    "log_center_T", "log_center_Rho",
    "total_moment_of_inertia", "spin_parameter", "log_total_angular_momentum",
    "conv_env_top_mass", "conv_env_bot_mass",
    "conv_env_top_radius", "conv_env_bot_radius",
    "conv_env_turnover_time_g", "conv_env_turnover_time_l_b",
    "conv_env_turnover_time_l_t", "envelope_binding_energy",
    "mass_conv_reg_fortides", "thickness_conv_reg_fortides",
    "radius_conv_reg_fortides",
    "lambda_CE_1cent", "lambda_CE_10cent", "lambda_CE_30cent",
    "co_core_mass", "co_core_radius", "lambda_CE_pure_He_star_10cent",
    "log_L_div_Ledd"
]

DEFAULT_SINGLE_HISTORY_COLS = (["model_number", "star_age", "star_mass"]
                               + DEFAULT_STAR_HISTORY_COLS)

DEFAULT_EEP_HISTORY_COLS = ["star_age", "star_mass"]+DEFAULT_STAR_HISTORY_COLS

DEFAULT_PROFILE_COLS = [
    "radius",
    "mass",
    "logRho",
    "omega",
    "energy",
    "x_mass_fraction_H",
    "y_mass_fraction_He",
    "z_mass_fraction_metals",
    "neutral_fraction_H",
    "neutral_fraction_He",
    "avg_charge_He",
]

EXTRA_STAR_COLS_AT_HE_DEPLETION = [
    "avg_c_in_c_core", "co_core_mass"
]

# Default columns to exclude from computing history/profile downsampling
DEFAULT_HISTORY_DS_EXCLUDE = [
    "model_number",     # not physical
    "age",              # this is the independent variable in binary grids
    "star_age"          # this is the independent variable in single grids
]

DEFAULT_PROFILE_DS_EXCLUDE = [
    "mass",             # this is the independent variable in binary grids
    "star_mass",        # this is the independent variable in single grids
]

# to gain extra compression exclude the follwoing columns
# we support two versions:
# - SMALL: DS_err = 0.01 and exclude COLS+['surface_he4', 'surface_h1']
# - LITE: DS_err = 0.1 and exclude COLS
EXTRA_COLS_DS_EXCLUDE = [
    'acc_radius', 'age', 'avg_charge_He', 'c12_c12', 'c_core_radius',
    'center_c12', 'center_gamma', 'center_n14', 'center_o16', 'co_core_radius',
    'conv_env_bot_mass', 'conv_env_bot_radius', 'conv_env_top_mass',
    'conv_env_top_radius', 'conv_env_turnover_time_g',
    'conv_env_turnover_time_l_b', 'conv_env_turnover_time_l_t', 'energy',
    'envelope_binding_energy', 'he_core_radius', 'lambda_CE_10cent',
    'lambda_CE_1cent', 'lambda_CE_30cent', 'lambda_CE_pure_He_star_10cent',
    'lg_mstar_dot_1', 'lg_mstar_dot_2', 'lg_wind_mdot_1', 'lg_wind_mdot_2',
    'log_LH', 'log_LHe', 'log_LZ', 'log_Lnuc', 'mass',
    'mass_conv_reg_fortides', 'model_number', 'neutral_fraction_H',
    'neutral_fraction_He', 'o_core_radius', 'radius_conv_reg_fortides',
    'rl_relative_overflow_1', 'rl_relative_overflow_2', 'spin_parameter',
    'star_age', 'star_mass', 'surf_avg_omega', 'surf_avg_omega_div_omega_crit',
    'surface_c12', 'surface_h1', 'surface_he4', 'surface_n14', 'surface_o16',
    't_sync_conv_1', 't_sync_conv_2', 't_sync_rad_1', 't_sync_rad_2',
    'thickness_conv_reg_fortides', 'trap_radius', 'x_mass_fraction_H',
    'xfer_fraction', 'y_mass_fraction_He', 'z_mass_fraction_metals',
    'log_L_div_Ledd'
]


# Properties defining the PSyGrid creation. They're saved as metadata.
GRIDPROPERTIES = {
    # file loading parameters
    "description": "",                      # description text
    "max_number_of_runs": None,
    "format": "hdf5",
    "compression": "gzip9",
    # history downsampling parameters
    "history_DS_error": None,
    "history_DS_exclude": DEFAULT_HISTORY_DS_EXCLUDE,
    # profile downsampling parameters
    "profile_DS_error": None,
    "profile_DS_interval": None,
    "profile_DS_exclude": DEFAULT_PROFILE_DS_EXCLUDE,
    # columns to pass to the PSyGrid
    "star1_history_saved_columns": "minimum",
    "star2_history_saved_columns": "minimum",
    "binary_history_saved_columns": "minimum",
    "star1_profile_saved_columns": "minimum",
    "star2_profile_saved_columns": "minimum",
    # Initial/Final value arrays
    "initial_value_columns": None,
    "final_value_columns": None,
    # grid-specific arguments
    "start_at_RLO": False,
    "stop_before_carbon_depletion": False,
    "binary": True,
    "eep": None,    # path to EEP files
    "initial_RLO_fix": False,
    "He_core_fix": True,
    "accept_missing_profile": False,
}


[docs] class PSyGrid: """Class handling a grid of MESA runs encoded in HDF5 format.""" def __init__(self, filepath=None, verbose=False): """Initialize the PSyGrid, and if `filename` is provided, open it. Parameters ---------- verbose : bool If `True`, the objects reports by printing to standard output. path : str or None If not None, it is the path of the HDF5 file to be loaded. """ self._reset() # initialize all except for `filepath` and `verbose` self.filepath = filepath self.verbose = verbose if self.filepath is not None: self.load(self.filepath) def _reset(self): """(Re)set attributes to defaults, except `filename` and `verbose`.""" self.close() self.hdf5 = None # h5py File handle if file is open self.MESA_dirs = [] self.initial_values = None self.final_values = None self.config = ConfigFile() self._make_compression_args() self.n_runs = 0 self.eeps = None def _make_compression_args(self): if "compression" not in self.config: self.compression_args = {} return compression = self.config["compression"] compression = "" if compression is None else compression.lower() if compression == "": compression_args = {} elif compression == "lzf": compression_args = {"compression": "lzf"} elif compression.startswith("gzip"): if compression == "gzip": compression_args = {"compression": "gzip"} else: if len(compression) > 5 or not compression[4].isdigit(): raise ValueError("GZIP compression level must be 0-9.") compression_level = int(compression[4]) compression_args = {"compression": "gzip", "compression_opts": compression_level} else: raise ValueError("Unknown compression `{}`.".format(compression)) self.compression_args = compression_args def _discover_eeps(self, path): self.eeps = {} for extension in EEP_FILE_EXTENSIONS: searchfor = os.path.join(path, "*" + extension) for filename in glob.glob(searchfor): # note that this is consistent with additional `.gz` extension identifier = os.path.basename(filename.split(extension)[-2]) assert identifier not in self.eeps self.eeps[identifier] = filename if len(self.eeps) == 0: self.eeps = None def _say(self, something): if self.verbose: print(something, flush=True)
[docs] def generate_config(self, **grid_kwargs): """Store the grid's configuration in a ConfigFile object.""" self.config = ConfigFile() for key in grid_kwargs: if key not in GRIDPROPERTIES: raise ValueError("`{}` is not a valid parameter name.". format(key)) new_dict = {} for varname in GRIDPROPERTIES: default_value = GRIDPROPERTIES[varname] new_dict[varname] = grid_kwargs.get(varname, default_value) self.config.update(new_dict) self._make_compression_args()
[docs] def create(self, MESA_grid_path, psygrid_path=None, overwrite=False, slim=False, warn="end", fmt="posydon", **grid_kwargs): """Create a new PSyGrid object from a MESA grid and save it on disk. Parameters ---------- MESA_grid_path : str The path of the top directory where the input grid is located. psygrid_path : str The path of the HDF5 file to be created (or overwritten). If not provided, it is assumed that it was defined during initialization. overwrite : bool Whether to overwrite the HDF5 file if it already exists. slim : bool If `True`, only the metadata and initial/final values are stored. warn : str How warnings are handled. If "normal", then warnings are shown when they occur. If "end", they are collected and shown at the end. If "suppress", then warnings are not shown at all. **grid_kwargs : dict Configuration of the new grid, passed as ddditional arguments. """ self._reset() if warn not in WARN_VALUES: raise ValueError("`warn` must be in: {}".format(WARN_VALUES)) if psygrid_path is not None: self.filepath = psygrid_path if self.filepath is None: ValueError("The path of the HDF5 file was not defined.") # Create the output directory if it doesn't exist dirpath = os.path.dirname(self.filepath) if dirpath != "": os.makedirs(dirpath, exist_ok=True) # Create the configuration object from the keyword arguments self.generate_config(**grid_kwargs) # Create the PSyGrid HDF5 file. Handle warnings according to `warn`. # if already exists, am I allowed to overwrite? if not overwrite and os.path.exists(self.filepath): raise Exception("File {} already exists.".format(self.filepath)) # open the HDF5 file and transcribe the grid driver_args = {} if "%d" not in self.filepath else { "driver": "family", "memb_size": HDF5_MEMBER_SIZE} with h5py.File(self.filepath, "w", **driver_args) as hdf5: if warn == "normal": self._create_psygrid(MESA_grid_path, hdf5=hdf5, slim=slim, fmt=fmt) else: with Catch_POSYDON_Warnings(record=True) as caught_warnings: self._create_psygrid(MESA_grid_path, hdf5=hdf5, slim=slim, fmt=fmt) if warn == "suppress": caught_warnings.reset_cache() else: # for consistency assert warn == "end" self.load()
def _create_psygrid(self, MESA_path, hdf5, slim=False, fmt="posydon"): """Generate a PSyGrid instance from a path to MESA data. Parameters ---------- MESA_path : str The path of the directory with the MESA runs. overwrite : bool Whether existing HDF5 file can be overwritten. slim : bool Whether to include runs' history/profile data in the HDF5 file. """ outpath = self.filepath assert outpath is not None # for consistency - expected to be defined binary_grid = self.config["binary"] initial_RLO_fix = self.config["initial_RLO_fix"] start_at_RLO = self.config["start_at_RLO"] stop_before_carbon_depletion = self.config["stop_before_carbon_depletion"] eep = self.config["eep"] if eep is not None: if binary_grid: Pwarn("Selected EEPs, switching to single-star grid.", "ReplaceValueWarning") self.config["binary"] = True binary_grid = True self._discover_eeps(eep) # prepare for loss-less compression compression_args = self.compression_args self._say("Reading input directory structure...") grid = GridReader(MESA_path, fmt=fmt, binary=binary_grid, verbose=self.verbose) # Determine expected number of runs (may not be as many in the end) N_runs = len(grid.runs) if not self.config['max_number_of_runs'] is None: N_runs = min(N_runs, self.config['max_number_of_runs']) # Decide on the columns to be included def decide_columns(key_in_config, defaults): """Get columns to be saved or `None` if setting is 'all').""" inconfig = self.config[key_in_config] if inconfig == "all": return None if inconfig == "minimum": return defaults if np.iterable(inconfig): if isinstance(inconfig, tuple): return list(inconfig) return join_lists(defaults, inconfig) raise Exception("{} setting not recognized.".format(key_in_config)) if binary_grid: H1_DEFAULTS = DEFAULT_STAR_HISTORY_COLS else: if self.eeps is None: H1_DEFAULTS = DEFAULT_SINGLE_HISTORY_COLS else: H1_DEFAULTS = DEFAULT_EEP_HISTORY_COLS BH_columns = decide_columns("binary_history_saved_columns", DEFAULT_BINARY_HISTORY_COLS) H1_columns = decide_columns("star1_history_saved_columns", H1_DEFAULTS) H2_columns = decide_columns("star2_history_saved_columns", DEFAULT_STAR_HISTORY_COLS) P1_columns = decide_columns("star1_profile_saved_columns", DEFAULT_PROFILE_COLS) P2_columns = decide_columns("star2_profile_saved_columns", DEFAULT_PROFILE_COLS) self._say("Infering column names in history tables...") all_history_columns = grid.infer_history_columns( BH_columns, H1_columns, H2_columns) if len(all_history_columns) == 0: raise Exception("PSyGrid object cannot be created: " "No history data in all runs.") self._say("Preparing initial/final_values arrays.") initial_value_columns = all_history_columns.copy() termination_flag_columns = (TERMINATION_FLAG_COLUMNS if binary_grid else TERMINATION_FLAG_COLUMNS_SINGLE) dtype_initial_values = ( [(col, 'f8') for col in initial_value_columns] + [(col, 'f8') for col in ["X", "Y", "Z"]] ) # get extra columns at He depletion for final values extra_final_values_cols = [] for starID in ["S1_", "S2_"]: for col in all_history_columns: if starID in col: for extra_col in EXTRA_STAR_COLS_AT_HE_DEPLETION: colname = starID + extra_col + "_at_He_depletion" extra_final_values_cols.append((colname, 'f8')) break dtype_final_values = ( [(col, 'f8') for col in all_history_columns] + [(col, H5_UNICODE_DTYPE) for col in termination_flag_columns] + ([("interpolation_class", H5_UNICODE_DTYPE)] if binary_grid else []) + extra_final_values_cols ) self.initial_values = initialize_empty_array( np.empty(N_runs, dtype=dtype_initial_values)) self.final_values = initialize_empty_array( np.empty(N_runs, dtype=dtype_final_values)) # in case lists were used, get a proper `dtype` object dtype_initial_values = self.initial_values.dtype dtype_final_values = self.final_values.dtype self._say('Loading MESA data...') #this int array will store the run_index of the i-th run and will be # -1 if the run is not included. run_included_at = np.full(N_runs, -1, dtype=int) run_index = 0 for i in tqdm.tqdm(range(N_runs)): # Select the ith run run = grid.runs[i] ignore = IgnoreReason() newTF1 = '' self._say('Processing {}'.format(run.path)) # Restrict number of runs if limit is set inside config if self.config['max_number_of_runs'] is not None: if run_index == self.config['max_number_of_runs']: self._say("Maximum number of runs reached.") break binary_history = read_MESA_data_file(run.binary_history_path, BH_columns) # if no binary history, ignore this run if binary_grid and binary_history is None: ignore.reason = "ignored_no_binary_history" Pwarn("Ignored MESA run because of missing binary history in:" " {}\n".format(run.path), "MissingFilesWarning") if not initial_RLO_fix: continue if ignore: history1 = None elif self.eeps is None: history1 = read_MESA_data_file(run.history1_path, H1_columns) else: try: eep_path = self.eeps[os.path.basename(run.path)] except KeyError: Pwarn("No matching EEP file for `" + run.path +\ "`. Ignoring run.", "MissingFilesWarning") continue history1 = read_EEP_data_file(eep_path, H1_columns) if self.config["He_core_fix"]: history1 = fix_He_core(history1) if not binary_grid and history1 is None: Pwarn("Ignored MESA run because of missing history in:" " {}\n".format(run.path), "MissingFilesWarning") ignore.reason = "ignored_no_history1" continue if ignore: history2 = None else: history2 = read_MESA_data_file(run.history2_path, H2_columns) if self.config["He_core_fix"]: history2 = fix_He_core(history2) # get values at He depletion and add them to the final values for starID in ["S1_", "S2_"]: if starID == "S1_": h = history1 elif starID == "S2_": h = history2 else: h = None i_He_depl = -1 for col in EXTRA_STAR_COLS_AT_HE_DEPLETION: fvcol = starID + col + "_at_He_depletion" if fvcol in dtype_final_values.names: if ((i_He_depl==-1) and (h is not None)): i_He_depl = get_i_He_depl(h) if ((h is not None) and (col in h.dtype.names) and (i_He_depl>=0)): self.final_values[i][fvcol] = h[col][i_He_depl] else: self.final_values[i][fvcol] = np.nan # scrub histories (unless EEPs are selected or run is ignored) if not ignore and self.eeps is None: # read the model numbers and ages from the histories colname = "model_number" if history1 is not None: if colname in H1_columns: history1_mod = np.int_(history1[colname].copy()) else: history1_mod = read_MESA_data_file(run.history1_path, [colname]) if history1_mod is not None: history1_mod = np.int_(history1_mod[colname]) if history1_mod is not None: len_diff = len(history1)-len(history1_mod) if len_diff<0: #shorten history1_mod history1_mod = history1_mod[:len_diff] Pwarn("Reduce mod in {}\n".format(run.history1_path), "ReplaceValueWarning") elif len_diff>0: #entend history1_mod add_mod = np.full(len_diff,history1_mod[-1]) history1_mod = np.concatenate((history1_mod, add_mod)) Pwarn("Expand mod in {}\n".format(run.history1_path), "ReplaceValueWarning") else: ignore.reason = "corrupted_history1" if "star_age" in H1_columns: history1_age = history1["star_age"].copy() else: history1_age = read_MESA_data_file(run.history1_path, ["star_age"]) if history1_age is not None: history1_age = history1_age["star_age"] if history1_age is not None: len_diff = len(history1)-len(history1_age) if len_diff<0: #shorten history1_age history1_age = history1_age[:len_diff] Pwarn("Reduce age in {}\n".format(run.history1_path), "ReplaceValueWarning") elif len_diff>0: #entend history1_age add_age = np.full(len_diff,history1_age[-1]) history1_age = np.concatenate((history1_age, add_age)) Pwarn("Expand age in {}\n".format(run.history1_path), "ReplaceValueWarning") else: ignore.reason = "corrupted_history1" else: history1_mod = None history1_age = None if history2 is not None: if colname in H2_columns: history2_mod = np.int_(history2[colname].copy()) else: history2_mod = read_MESA_data_file(run.history2_path, [colname]) if history2_mod is not None: history2_mod = np.int_(history2_mod[colname]) if history2_mod is not None: len_diff = len(history2)-len(history2_mod) if len_diff<0: #shorten history2_mod history2_mod = history2_mod[:len_diff] Pwarn("Reduce mod in {}\n".format(run.history2_path), "ReplaceValueWarning") elif len_diff>0: #entend history2_mod add_mod = np.full(len_diff,history2_mod[-1]) history2_mod = np.concatenate((history2_mod, add_mod)) Pwarn("Expand mod in {}\n".format(run.history2_path), "ReplaceValueWarning") else: ignore.reason = "corrupted_history2" if "star_age" in H2_columns: history2_age = history2["star_age"].copy() else: history2_age = read_MESA_data_file(run.history2_path, ["star_age"]) if history2_age is not None: history2_age = history2_age["star_age"] if history2_age is not None: len_diff = len(history2)-len(history2_age) if len_diff<0: #shorten history2_age history2_age = history2_age[:len_diff] Pwarn("Reduce age in {}\n".format(run.history2_path), "ReplaceValueWarning") elif len_diff>0: #entend history2_age add_age = np.full(len_diff,history2_age[-1]) history2_age = np.concatenate((history2_age, add_age)) Pwarn("Expand age in {}\n".format(run.history2_path), "ReplaceValueWarning") else: ignore.reason = "corrupted_history2" else: history2_mod = None history2_age = None if binary_history is not None: if colname in BH_columns: binary_history_mod = np.int_(binary_history[colname].copy()) else: binary_history_mod = read_MESA_data_file( run.binary_history_path, [colname]) if binary_history_mod is not None: binary_history_mod = np.int_(binary_history_mod[colname]) if binary_history_mod is not None: len_diff = len(binary_history)-len(binary_history_mod) if len_diff<0: #shorten binary_history_mod binary_history_mod = binary_history_mod[:len_diff] Pwarn("Reduce mod in {}\n".format(run.binary_history_path), "ReplaceValueWarning") elif len_diff>0: #entend binary_history_mod add_mod = np.full(len_diff,binary_history_mod[-1]) binary_history_mod = np.concatenate((binary_history_mod, add_mod)) Pwarn("Expand mod in {}\n".format(run.binary_history_path), "ReplaceValueWarning") else: ignore.reason = "corrupted_binary_history" if "age" in BH_columns: binary_history_age = binary_history["age"].copy() else: binary_history_age = read_MESA_data_file( run.binary_history_path, ["age"]) if binary_history_age is not None: binary_history_age = binary_history_age["age"] if binary_history_age is not None: len_diff = len(binary_history)-len(binary_history_age) if len_diff<0: #shorten binary_history_age binary_history_age = binary_history_age[:len_diff] Pwarn("Reduce age in {}\n".format(run.binary_history_path), "ReplaceValueWarning") elif len_diff>0: #entend binary_history_age add_age = np.full(len_diff,binary_history_age[-1]) binary_history_age = np.concatenate((binary_history_age, add_age)) Pwarn("Expand age in {}\n".format(run.binary_history_path), "ReplaceValueWarning") else: ignore.reason = "corrupted_binary_history" else: binary_history_mod = None binary_history_age = None # scrub the histories if binary_grid: history1, history2, binary_history = scrub( tables=[history1, history2, binary_history], models=[history1_mod, history2_mod, binary_history_mod], ages=[history1_age, history2_age, binary_history_age] ) else: history1, history2, binary_history = scrub( tables=[history1, None, None], models=[history1_mod, None, None], ages=[history1_age, None, None] ) # if scubbing wiped all the binary history, discard run if binary_history is not None: binary_history_len = len(binary_history) else: binary_history_len = 0 if binary_grid and binary_history_len == 0: ignore.reason = "ignored_scrubbed_history" Pwarn("Ignored MESA run because of scrubbed binary" " history in: {}\n".format(run.path), "InappropriateValueWarning") if not initial_RLO_fix: continue if history1 is not None: history1_len = len(history1) else: history1_len = 0 if not binary_grid and history1_len == 0: ignore.reason = "ignored_scrubbed_history" Pwarn("Ignored MESA run because of scrubbed" " history in: {}\n".format(run.path), "InappropriateValueWarning") continue try: #get mass from binary history init_mass_1 = float(binary_history["star_1_mass"][0]) except: #otherwise get it from directory name params_from_path = initial_values_from_dirname(run.path) init_mass_1 = float(params_from_path[0]) # check whether stop at He depletion is requested if stop_before_carbon_depletion and init_mass_1>=100.0: kept = keep_till_central_abundance_He_C(binary_history, history1, history2, THRESHOLD_CENTRAL_ABUNDANCE, THRESHOLD_CENTRAL_ABUNDANCE_LOOSE_C) binary_history, history1, history2, newTF1 = kept # check whether start at RLO is requested, and chop the history if start_at_RLO: kept = keep_after_RLO(binary_history, history1, history2) if kept is None: ignore.reason = "ignored_no_RLO" if self.verbose: self._say("Ignored MESA run because of no RLO" " in: {}\n".format(run.path)) if not initial_RLO_fix: # TODO: we may want to keep these systems for # allowing rerun grids to inform the old grids that # a system is not RLOing anymore! continue binary_history, history1, history2 = None, None, None else: binary_history, history1, history2 = kept if ignore: binary_history, history1, history2 = None, None, None final_profile1, final_profile2 = None, None else: final_profile1 = read_MESA_data_file( run.final_profile1_path, P1_columns) final_profile2 = read_MESA_data_file( run.final_profile2_path, P2_columns) if not binary_grid and final_profile1 is None: if self.config["accept_missing_profile"]: Pwarn("Including MESA run despite the missing profile" " in: {}\n".format(run.path), "MissingFilesWarning") else: Pwarn("Ignored MESA run because of missing profile in:" " {}\n".format(run.path), "MissingFilesWarning") ignore.reason = "ignored_no_final_profile" continue if binary_history is not None: if binary_history.shape == (): # if history is only one line initial_BH = binary_history.copy() final_BH = binary_history.copy() else: # disentagle arrays with `copy` because the LHSs may change initial_BH = binary_history[0].copy() final_BH = binary_history[-1].copy() else: initial_BH = None final_BH = None if history1 is not None: if history1.shape == (): # if history is only one line initial_H1 = history1.copy() final_H1 = history1.copy() else: # disentagle arrays with `copy` because the LHSs may change initial_H1 = history1[0].copy() final_H1 = history1[-1].copy() else: initial_H1 = None final_H1 = None if history2 is not None: if history2.shape == (): # if history is only one line initial_H2 = history2.copy() final_H2 = history2.copy() else: # disentagle arrays with `copy` because the LHSs may change initial_H2 = history2[0].copy() final_H2 = history2[-1].copy() else: initial_H2 = None final_H2 = None # get some initial values from the `binary_history.data` header # if of course, no RLO fix is applied if binary_grid and not (start_at_RLO or ignore): # this is compatible with `.gz` files bh_header = np.genfromtxt(run.binary_history_path, skip_header=1, max_rows=1, names=True) can_compute_new_separation = True if "star_1_mass" in dtype_initial_values.names: init_mass_1 = bh_header["initial_don_mass"] initial_BH["star_1_mass"] = init_mass_1 else: can_compute_new_separation = False if "star_2_mass" in dtype_initial_values.names: init_mass_2 = bh_header["initial_acc_mass"] initial_BH["star_2_mass"] = init_mass_2 else: can_compute_new_separation = False if "period_days" in dtype_initial_values.names: init_period = bh_header["initial_period_days"] initial_BH["period_days"] = init_period else: can_compute_new_separation = False if ("binary_separation" in dtype_initial_values.names and can_compute_new_separation): init_separation = orbital_separation_from_period( init_period, init_mass_1, init_mass_2) initial_BH["binary_separation"] = init_separation elif not binary_grid and not (start_at_RLO or ignore): # use header to get initial mass in single-star grids # this is compatible with `.gz` files h1_header = np.genfromtxt(run.history1_path, skip_header=1, max_rows=1, names=True) if "S1_star_mass" in dtype_initial_values.names: init_mass_1 = h1_header["initial_m"] initial_H1["star_mass"] = init_mass_1 # get some initial values from the `LOGS1/history.data` header addX = "X" in dtype_initial_values.names addY = "Y" in dtype_initial_values.names addZ = "Z" in dtype_initial_values.names if (addX or addY or addZ) and not ignore: # read abundances from history1 if present, else history2 if history1 is not None: read_from = run.history1_path elif history2 is not None: read_from = run.history2_path else: read_from = None # if not star history file, NaN values if read_from is None: init_X, init_Y, init_Z = np.nan, np.nan, np.nan # try to get metallicity from directory name params_from_path = initial_values_from_dirname(run.path) if (len(params_from_path)==4) or (len(params_from_path)==2): init_Z = params_from_path[-1] else: star_header = np.genfromtxt( read_from, skip_header=1, max_rows=1, names=True) init_Z = star_header["initial_Z"] init_Y = star_header["initial_Y"] init_X = 1.0 - init_Z - init_Y if binary_grid: where_to_add = initial_BH else: where_to_add = initial_H1 if addZ: where_to_add = add_field(where_to_add, [("Z", '<f8')]) where_to_add["Z"] = init_Z if addY: where_to_add = add_field(where_to_add, [("Y", '<f8')]) where_to_add["Y"] = init_Y if addX: where_to_add = add_field(where_to_add, [("X", '<f8')]) where_to_add["X"] = init_X # set initial/final values if initial_BH is not None: for col in initial_BH.dtype.names: self.initial_values[i][col] = initial_BH[col] for col in final_BH.dtype.names: self.final_values[i][col] = final_BH[col] if initial_H1 is not None: for col in initial_H1.dtype.names: newcol = "S1_" + col self.initial_values[i][newcol] = initial_H1[col] for col in final_H1.dtype.names: newcol = "S1_" + col self.final_values[i][newcol] = final_H1[col] if initial_H2 is not None: for col in initial_H2.dtype.names: newcol = "S2_" + col self.initial_values[i][newcol] = initial_H2[col] for col in final_H2.dtype.names: newcol = "S2_" + col self.final_values[i][newcol] = final_H2[col] if not ignore: if addX: self.initial_values[i]["X"] = where_to_add["X"] if addY: self.initial_values[i]["Y"] = where_to_add["Y"] if addZ: self.initial_values[i]["Z"] = where_to_add["Z"] if binary_grid: if ignore: termination_flags = [ignore.reason] * N_FLAGS else: termination_flags = get_flags_from_MESA_run( run.out_txt_path, binary_history=binary_history, history1=history1, history2=history2, start_at_RLO=start_at_RLO, newTF1=newTF1) else: if ignore: termination_flags = [ignore.reason] * N_FLAGS_SINGLE else: termination_flags = [ get_flag_from_MESA_output(run.out_txt_path), check_state_from_history(history1, history1["star_mass"]) ] if ignore: for colname in self.final_values.dtype.names: if (colname.startswith("termination_flag_") or colname.startswith("interpolation_class")): self.final_values[i][colname] = ignore.reason else: self.final_values[i][colname] = np.nan for colname in self.initial_values.dtype.names: self.initial_values[i][colname] = np.nan # now update the initial values from the grid point data grid_point = read_initial_values(run.path) for colname, value in grid_point.items(): if colname in self.initial_values.dtype.names: self.initial_values[i][colname] = value if np.isnan(self.initial_values[i]["Z"]): # try to get metallicity from directory name params_from_path = initial_values_from_dirname(run.path) if (len(params_from_path)==4) or\ (len(params_from_path)==2): self.initial_values[i]["Z"] = params_from_path[-1] else: for flag, col in zip(termination_flags, termination_flag_columns): if flag is None: flag = "" self.final_values[i][col] = flag if binary_grid: self.final_values[i]["interpolation_class"] = \ infer_interpolation_class(*termination_flags[:2]) if ignore: # if not fix requested and failed run, do not include the data if start_at_RLO and (ignore.reason == "ignored_no_RLO"): # allow non-RLOing systems to be included in the grid # so that rerun grid can signify systems as non-RLOing # instead of keeping the old system pass else: continue # no fix... discard this system # if data to be included, downsample and store if not slim: if ignore: hdf5.create_group("/grid/run{}/".format(run_index)) else: # downsample and save histories_DS = downsample_history( binary_history, history1, history2, params=self.config) binary_history_DS, history1_DS, history2_DS = histories_DS final_profile1_DS = downsample_profile( final_profile1, params=self.config) final_profile2_DS = downsample_profile( final_profile2, params=self.config) # store the data arrays_to_save = [binary_history_DS, history1_DS, history2_DS, final_profile1_DS, final_profile2_DS] keys_to_arrays = ['binary_history', 'history1', 'history2', 'final_profile1', 'final_profile2'] for array, name in zip(arrays_to_save, keys_to_arrays): if array is not None: hdf5.create_dataset( "/grid/run{}/{}".format(run_index, name), data=array, **compression_args) # consider the run (and the input directory) included self.MESA_dirs.append(run.path) run_included_at[i] = run_index run_index += 1 # check that new MESA path is added at run_index lenMESA_dirs = len(self.MESA_dirs) if lenMESA_dirs!=run_index: Pwarn("Non synchronous indexing: " +\ "run_index={} != ".format(run_index) +\ "length(MESA_dirs)={}".format(lenMESA_dirs), "InappropriateValueWarning") #general fix for termination_flag in case of initial RLO in binaries if binary_grid and initial_RLO_fix: #create list of already detected initial RLO detected_initial_RLO = get_detected_initial_RLO(self) colnames = ["termination_flag_1", "termination_flag_2", "interpolation_class"] valtoset = ["forced_initial_RLO", "forced_initial_RLO", "initial_MT"] for i in range(N_runs): flag1 = self.final_values[i]["termination_flag_1"] if flag1 != "Terminate because of overflowing initial model": #use grid point data (if existing) to detect initial RLO grid_point = read_initial_values(grid.runs[i].path) if "star_1_mass" in grid_point: mass1 = grid_point["star_1_mass"] else: mass1 = self.initial_values[i]["star_1_mass"] Pwarn("No star_1_mass in "+grid.runs[i].path, "ReplaceValueWarning") if "star_2_mass" in grid_point: mass2 = grid_point["star_2_mass"] else: mass2 = self.initial_values[i]["star_2_mass"] Pwarn("No star_2_mass in "+grid.runs[i].path, "ReplaceValueWarning") if "period_days" in grid_point: period = grid_point["period_days"] else: period = self.initial_values[i]["period_days"] Pwarn("No period_days in "+grid.runs[i].path, "ReplaceValueWarning") nearest = get_nearest_known_initial_RLO(mass1, mass2, detected_initial_RLO) if period<nearest["period_days"]: #set values for colname, value in zip(colnames, valtoset): self.final_values[i][colname] = value #copy values from nearest known system for colname in ["termination_flag_3", "termination_flag_4"]: if colname in nearest: self.final_values[i][colname]=nearest[colname] #reset the initial values to the grid point data for colname, value in grid_point.items(): if colname in self.initial_values.dtype.names: self.initial_values[i][colname] = value #add initial RLO system if not added before if run_included_at[i]==-1: if not slim: hdf5.create_group( "/grid/run{}/".format(run_index)) #include the run (and the input directory) self.MESA_dirs.append(grid.runs[i].path) run_included_at[i] = run_index run_index += 1 #check that new MESA path is added at run_index lenMESA_dirs = len(self.MESA_dirs) if lenMESA_dirs!=run_index: Pwarn("Non synchronous indexing: " +\ "run_index={} != ".format(run_index) +\ "length(MESA_dirs)={}".format(lenMESA_dirs), "InappropriateValueWarning") self._say("Storing initial/final values and metadata to HDF5...") #create new array of initial and finial values with included runs # only and sort it by run_index new_initial_values = initialize_empty_array( np.empty(run_index, dtype=dtype_initial_values)) new_final_values = initialize_empty_array( np.empty(run_index, dtype=dtype_final_values)) for i in range(N_runs): #only use included runs with a valid index if run_included_at[i]>=0: #check for index range if run_included_at[i]>=run_index: Pwarn("run {} has a run_index out of ".format(i) + "range: {}>={}".format(run_included_at[i], run_index), "InappropriateValueWarning") continue #copy initial values or fill with nan if not existing in original for colname in dtype_initial_values.names: if colname in self.initial_values.dtype.names: value = self.initial_values[i][colname] new_initial_values[run_included_at[i]][colname] = value else: new_initial_values[run_included_at[i]][colname] = np.nan #copy final values or fill with nan if not existing in original for colname in dtype_final_values.names: if colname in self.final_values.dtype.names: value = self.final_values[i][colname] new_final_values[run_included_at[i]][colname] = value else: new_final_values[run_included_at[i]][colname] = np.nan #replace old initial/final value array self.initial_values = np.copy(new_initial_values) self.final_values = np.copy(new_final_values) # Store the full table of initial_values hdf5.create_dataset("/grid/initial_values", data=self.initial_values, **compression_args) # Store the full table of final_values (including termination flags) termination_flag_columns = np.array(termination_flag_columns) hdf5.create_dataset("/grid/final_values", data=self.final_values, **compression_args) hdf5.attrs["config"] = json.dumps(str(dict(self.config))) hdf5.create_dataset( "relative_file_paths", data=np.asarray(self.MESA_dirs, dtype=H5_UNICODE_DTYPE), **compression_args)
[docs] def add_column(self, colname, array, where="final_values", overwrite=True): """Add a new numerical column in the final values array.""" if not isinstance(array, np.ndarray): arr = np.asarray(array) else: arr = array if where != "final_values": raise ValueError("Only adding columns to `final_values` allowed.") if len(self) != len(arr): raise ValueError("`array` has {} elements but the grid has {} runs" .format(len(arr), len(self))) if colname in self.final_values.dtype.names: if overwrite: self.final_values[colname] = arr else: raise Exception("Column `{}` already exists in final values.". format(colname)) else: self.final_values = add_field(self.final_values, [(colname, arr.dtype.descr[0][1])]) self.final_values[colname] = arr self.update_final_values()
[docs] def update_final_values(self): """Update the final values in the HDF5 file.""" self._reload_hdf5_file(writeable=True) new_dtype = [] for dtype in self.final_values.dtype.descr: if (dtype[0].startswith("termination_flag") or (dtype[0] == "mt_history") or ("_type" in dtype[0]) or ("_state" in dtype[0]) or ("_class" in dtype[0])): dtype = (dtype[0], H5_REC_STR_DTYPE.replace("U", "S")) new_dtype.append(dtype) if dtype[1] == np.dtype('O'): print(dtype[0]) final_values = self.final_values.astype(new_dtype) del self.hdf5["/grid/final_values"] self.hdf5.create_dataset("/grid/final_values", data=final_values, **self.compression_args) # self.hdf5["/grid/final_values"] = final_values self._reload_hdf5_file(writeable=False)
def _reload_hdf5_file(self, writeable=False): driver_args = {} if "%d" not in self.filepath else { "driver": "family", "memb_size": HDF5_MEMBER_SIZE} self.close() mode = "a" if writeable else "r" self.hdf5 = h5py.File(self.filepath, mode, **driver_args)
[docs] def load(self, filepath=None): """Load up a previously created PSyGrid object from an HDF5 file. Parameters ---------- filepath : str Location of the HDF5 file to be loaded. If not provided, assume it was defined during the initialization (argument: `filepath`). """ self._say("Loading HDF5 grid...") # if not filepath defined, take it from the attribute if filepath is not None: self.filepath = filepath if self.filepath is None: raise ValueError("The path of the HDF5 file was not defined.") driver_args = {} if "%d" not in self.filepath else { "driver": "family", "memb_size": HDF5_MEMBER_SIZE} self.hdf5 = h5py.File(self.filepath, "r", **driver_args) hdf5 = self.hdf5 # load initial/final_values self._say("\tLoading initial/final values...") self.initial_values = hdf5['/grid/initial_values'][()] self.final_values = hdf5['/grid/final_values'][()] # change ASCII to UNICODE in termination flags in `final_values` new_dtype = [] for dtype in self.final_values.dtype.descr: if (dtype[0].startswith("termination_flag") or (dtype[0] == "mt_history") or ("_type" in dtype[0]) or ("_state" in dtype[0]) or ("_class" in dtype[0])): dtype = (dtype[0], H5_REC_STR_DTYPE.replace("S", "U")) new_dtype.append(dtype) self.final_values = self.final_values.astype(new_dtype) # load MESA dirs self._say("\tAcquiring paths to MESA directories...") self.MESA_dirs = list(hdf5["/relative_file_paths"]) # load config & assert that run indices are 0-indexed & sequential # load configuration self._say("\tGetting configuration metadata...") configuration_string = hdf5.attrs["config"] config = ast.literal_eval(ast.literal_eval(configuration_string)) self.config = ConfigFile() self.config.update(config) # enumerate the runs included, and ensure that: # (i) are as many as the MESA_dirs, or 0 (slim psyrid) # (ii) sequential and starting from 0 self._say("\tEnumerating runs and checking integrity of grid...") n_expected = len(self.MESA_dirs) included = np.zeros(n_expected, dtype=bool) for key in hdf5["/grid"].keys(): if key.startswith("run") and key != "run": index = int(key[3:]) if index < 0: raise KeyError("Negative index {} does not make sense". format(index)) if index >= n_expected: raise KeyError("More runs than MESA dirs? Gaps?") if included[index]: raise KeyError("Duplicate key {}?".format(key)) included[index] = True if not np.any(included): # if slim grid... no problem! self.n_runs = 0 elif np.all(included): self.n_runs = n_expected else: raise KeyError("Some runs are missing from the HDF5 grid.") self._say("\tDone.")
[docs] def close(self): """Close the HDF5 file if open.""" if hasattr(self, "hdf5") and self.hdf5 is not None: self.hdf5.close() self.hdf5 = None
def __str__(self): """Return the status of the PSyGrid.""" ret = "PSyGrid instance:\n" if self.filepath is None: ret += "\tNo HDF5 file path. PSyGrid instance is likely empty.\n" return ret ret += "\tLoaded from: {}\n\n".format(self.filepath) # Check if instance has loaded up the .npz data if self.n_runs == 0: ret += "\tNo runs found in the grid (empty or 'slim' version).\n" else: ret += "\t{} runs found. They include:\n".format(self.n_runs) # Cycle through runs to count the number of data files def one_if_ok(data): """1 if data is not None and has at least one row, else 0.""" return 0 if data is None or len(data) == 0 else 1 N_binary_history = 0 N_history1 = 0 N_history2 = 0 N_final_profile1 = 0 N_final_profile2 = 0 for run in self: N_binary_history += one_if_ok(run.binary_history) N_history1 += one_if_ok(run.history1) N_history2 += one_if_ok(run.history2) N_final_profile1 += one_if_ok(run.final_profile1) N_final_profile2 += one_if_ok(run.final_profile2) ret += "\t\t{} binary_history files.\n".format(N_binary_history) ret += "\t\t{} history1 files.\n".format(N_history1) ret += "\t\t{} history2 files.\n".format(N_history2) ret += "\t\t{} final_profile1 files.\n".format(N_final_profile1) ret += "\t\t{} final_profile2 files.\n".format(N_final_profile2) if run.binary_history is not None: ret += "\nColumns in binary_history: {}\n".format( run.binary_history.dtype.names) if run.history1 is not None: ret += "\nColumns in history1: {}\n".format( run.history1.dtype.names) if run.history2 is not None: ret += "\nColumns in history2: {}\n".format( run.history2.dtype.names) if run.final_profile1 is not None: ret += "\nColumns in final_profile1: {}\n".format( run.final_profile1.dtype.names) if run.final_profile2 is not None: ret += "\nColumns in final_profile2: {}\n".format( run.final_profile2.dtype.names) ret += "\n" # Print out initial values array parameters if self.initial_values is None: ret += "Initial values array is not loaded.\n\n" else: ret += "Columns in initial values:\n" ret += str(self.initial_values.dtype.names) + "\n\n" # Print out final values array parameters if self.final_values is None: ret += "Final values array is not loaded.\n\n" else: ret += "Columns in final values:\n" ret += str(self.final_values.dtype.names) + "\n\n" # Print out configuration file parameters ret += "Configuration:\n" config_string = str(self.config) if config_string == "": ret += "\t(empty)\n\n" else: # separate config items with lines that start with a tab ret += "\t" + "\n\t".join(config_string.split("\n")) + "\n\n" # Print out examples of MESA dirs (if grid is not empty) n_dirs = len(self.MESA_dirs) if n_dirs != 0 or self.n_runs != 0: ret += "Relative paths to MESA run directories:\n" if n_dirs == 0: ret += "\tNo MESA directory found.\n" elif n_dirs == 1: ret += "\t{}\n".format(self.MESA_dirs[0]) elif n_dirs == 2: ret += "\t{}\n".format(self.MESA_dirs[0]) ret += "\t{}\n".format(self.MESA_dirs[1]) else: ret += "\t{}\n".format(self.MESA_dirs[0]) ret += "\t...({} other directories)\n".format(n_dirs - 2) ret += "\t{}\n".format(self.MESA_dirs[-1]) return ret def __getitem__(self, index): """Return a PSyRunView instance for the run with index `index`.""" if not np.issubdtype(type(index), int): raise TypeError("Index {} is not of type int".format(index)) if index not in self: raise IndexError("Index {} out of bounds.".format(index)) return PSyRunView(self, index)
[docs] def get_pandas_initial_final(self): """Convert the initial/final values into a single Pandas dataframe.""" df = pd.DataFrame() for key in self.initial_values.dtype.names: new_col_name = "initial_" + key df[new_col_name] = self.initial_values[key] for key in self.final_values.dtype.names: new_col_name = "final_" + key df[new_col_name] = self.final_values[key] return df
def __len__(self): """Return the number of runs in the grid.""" return self.n_runs def __contains__(self, index): """Return True if run with index `index` is in the grid.""" if not np.issubdtype(type(index), int): raise TypeError("Index {} is not of type int".format(index)) return 0 <= index < self.n_runs def __iter__(self): """Allow iteration of runs (`for` loops, lists, sets).""" return PSyGridIterator(self) def __del__(self): """Destructor of the object, closing the HDF5 file if opened.""" # IMPORTANT: do not delete this - it is necessary in interactive mode self.close()
[docs] def rerun(self, path_to_file='./', runs_to_rerun=None, termination_flags=None, new_mesa_flag=None, flags_to_check=None): """Create a CSV file with the PSyGrid initial values to rerun. This methods allows you to create a CSV file with the psygrid initial values you want to rerun. Parameters ---------- path_to_file : str The path to the directory where the new `grid.csv` file will be saved. If the directory does not exist it will be created. runs_to_rerun : list of integers Array containing the indecies of the psygrid runs you want to rerun e.g., runs_to_rerun = [2,3] termination_flags : str or list of str The runs with this termination flag will be rerun. e.g. termination_flags='max_number_retries' new_mesa_flag : dict Dictionary of flags with their value to add as extra columns to the `grid.csv`. The user can specify any arbitrary amount of flags. e.g. new_mesa_flag = {'varcontrol_target': 0.01} flags_to_check : str or list of str The key(s) of flags to check the termination_flags against. e.g. check_flags = 'termination_flag_1' """ # check that the 'path_to_file' exists if not os.path.exists(path_to_file): os.makedirs(path_to_file) # check 'runs_to_rerun' has a valid type if isinstance(runs_to_rerun, list): runs_to_rerun_list = runs_to_rerun elif runs_to_rerun is not None: try: runs_to_rerun_list = list(runs_to_rerun) except: raise TypeError("'runs_to_rerun' should be a list or None.") # check termination flags if termination_flags is not None: if flags_to_check is None: flags_to_check_list = ['termination_flag_1'] elif isinstance(flags_to_check, str): flags_to_check_list = [flags_to_check] elif isinstance(flags_to_check, list): flags_to_check_list = flags_to_check else: try: flags_to_check_list = list(flags_to_check) except: raise TypeError("'flags_to_check' should be a string or a " "list of strings.") if isinstance(termination_flags, str): termination_flags_list = [termination_flags] elif isinstance(termination_flags, list): termination_flags_list = termination_flags else: try: termination_flags_list = list(termination_flags) except: raise TypeError("'termination_flags' should be a string or" " a list of strings.") # getting indices of runs with the termination flag(s) new_runs_to_rerun_list = [] for flag in flags_to_check_list: if flag in self.final_values.dtype.names: for i, tf in enumerate(self.final_values[flag]): if tf in termination_flags_list: new_runs_to_rerun_list.append(i) else: self._say("\tFlag: {} not found. Skip it.".format(flag)) # add runs with the termination flag(s) to the collection of reruns if runs_to_rerun is None: runs_to_rerun_list = new_runs_to_rerun_list else: runs_to_rerun_list += new_runs_to_rerun_list elif runs_to_rerun is None: raise ValueError("Either 'runs_to_rerun' or 'termination_flags' " "has to be specified and therefore different from" " None.") # ensure that each index only appears once runs_to_rerun_list_unique = list(dict.fromkeys(runs_to_rerun_list)) # getting columns for the new grid.csv file column_names = {} if 'star_1_mass' in self.initial_values.dtype.names: column_names['m1'] = 'star_1_mass' if 'star_2_mass' in self.initial_values.dtype.names: column_names['m2'] = 'star_2_mass' if 'period_days' in self.initial_values.dtype.names: column_names['initial_period_in_days'] = 'period_days' if 'Z' in self.initial_values.dtype.names: if len(self.MESA_dirs)>0: # assume first entry of the MESA_dirs is representative of the # grid MESA_dir_name = self.MESA_dirs[0].decode("utf-8") if 'initial_z' in MESA_dir_name: column_names['initial_z'] = 'Z' if 'Zbase' in MESA_dir_name: column_names['Zbase'] = 'Z' if 'new_Z' in MESA_dir_name: column_names['new_Z'] = 'Z' else: raise Exception("No MESA dirs of previous runs in the grid.") # getting the initial data set NDIG = 10 # rounding matches initial point rounding runs_data = {} for key in column_names.keys(): grid_key = column_names[key] runs_data[key] = [] for idx in runs_to_rerun_list_unique: runs_data[key].append(np.around(self.initial_values[grid_key][idx], NDIG)) if len(column_names)>0: n_runs = len(runs_data[list(column_names)[0]]) else: n_runs = 0 # add new_mesa_flag if new_mesa_flag is not None: for key in new_mesa_flag.keys(): runs_data[key] = [new_mesa_flag[key]]*n_runs runs_data_frame = pd.DataFrame(runs_data) runs_data_frame.to_csv(os.path.join(path_to_file, 'grid.csv'), na_rep='nan', index=False)
[docs] def plot2D(self, x_var_str, y_var_str, z_var_str=None, termination_flag='termination_flag_1', grid_3D=None, slice_3D_var_str=None, slice_3D_var_range=None, grid_4D=None, slice_4D_var_str=None, slice_4D_var_range=None, extra_grid=None, slice_at_RLO=False, MARKERS_COLORS_LEGENDS=None, max_cols=3, legend_pos=(3, 3), verbose=False, **kwargs): """Plot a 2D slice of x_var_str vs y_var_str of one or more runs. Parameters ---------- x_var_str : str String of the initial value to plot on the x axis. Allowed strings are `psygrid.initial_values.dtype.names`. y_var_str : str String of the initial value to plot on the y axis. Allowed strings are `psygrid.initial_values.dtype.names`. z_var_str : str String of the initial value to plot on the z axis (displayed as a color). Allowed strings are `psygrid.final_values.dtype.names`, `psygrid.history1.dtype.names`, `psygrid.binary_history.dtype.names`. termination_flag : str Termination flag to display, allowed values are: "termination_flag_1", "termination_flag_2", "termination_flag_3", "termination_flag_4", "all". grid_3D : bool If `True`, the psygrid object is a 3D grid and needs to be sliced. slice_3D_var_str : str Variable along which the 3D space will be sliced. Allowed values are `psygrid.initial_values.dtype.names`. slice_3D_var_range : tuple or a list of tuples Range between which you want to slice the variable slice_3D_var_str e.g., `(2.5,3.)`. In case of a list of tuples, one will get a large plot with one subplot for each tuple in the list. grid_4D : bool If `True`, the psygrid object is a 4D grid and needs to be sliced. slice_4D_var_str : str Variable along which the 4D space will be sliced. Allowed values are `psygrid.initial_values.dtype.names`. slice_4D_var_range : tuple or a list of tuples Range between which you want to slice the variable slice_4D_var_str e.g., `(2.5,3.)`. In case of a list of tuples, one will get a large plot with one subplot for each tuple in the list. extra_grid : object or array of objects If subset of the grid was rerun a or an extention was added, one can overlay the new psygrid by passing it here. slice_at_RLO : bool If `True`, the object plots the tracks until onset of Roche Lobe overflow. MARKERS_COLORS_LEGENDS : dict Each termination flag is associated with a marker shape, size, color and label (cf. `MARKERS_COLORS_LEGENDS` in `plot_defaults.py`). max_cols : int Defines the maximum number of columns of subplots. Default: 3 legend_pos : SubplotSpec (int or tuple) Defines which subplots won't contain an axis but are used to display the legend there. Default: (3, 3) verbose : bool If `True`, the object reports by printing to standard output. **kwargs : dict Dictionary containing extra visualisation options (cf. `PLOT_PROPERTIES` in `plot_defaults.py`. """ plot = plot2D(psygrid=self, x_var_str=x_var_str, y_var_str=y_var_str, z_var_str=z_var_str, termination_flag=termination_flag, grid_3D=grid_3D, slice_3D_var_str=slice_3D_var_str, slice_3D_var_range=slice_3D_var_range, grid_4D=grid_4D, slice_4D_var_str=slice_4D_var_str, slice_4D_var_range=slice_4D_var_range, extra_grid=extra_grid, slice_at_RLO=slice_at_RLO, MARKERS_COLORS_LEGENDS=MARKERS_COLORS_LEGENDS, max_cols=max_cols, legend_pos=legend_pos, verbose=verbose, **kwargs) plot()
[docs] def plot(self, idx, x_var_str, y_var_str, z_var_str=None, history='binary_history', verbose=False, **kwargs): """Plot a 1D track of x_var_str vs y_var_str. Parameters ---------- idx : int or list of int Index or indices of runs to plot. x_var_str : str String of values to plot on the x axis. Allowed strings are the one in `psygrid.history.dtype.names` where "history" needs to be chosen accordingly. y_var_str : str or list of str String or list of stringvalues to plot on the y axis. Allowed strings are the one in `psygrid.history.dtype.names` where "history" needs to be chosen accordingly. z_var_str : str String of values to plot on the z axis (displayed with a color). Allowed strings are the one in `psygrid.history.dtype.names` where "history" needs to be chosen accordingly. history : str The x, y, z variables are read from either: "binary_history", "history1", "history2". verbose : bool If `True`, the object reports by printing to standard output. **kwargs : dict Dictionary containing extra visualisation options (cf. `PLOT_PROPERTIES` in `plot_defaults.py`). """ if isinstance(idx, list): runs = [] for i in idx: runs.append(self[i]) elif isinstance(idx, int): runs = [self[idx]] else: raise ValueError('Invalid idx = {}!'.format(idx)) plot = plot1D(run=runs, x_var_str=x_var_str, y_var_str=y_var_str, z_var_str=z_var_str, history=history, HR=False, verbose=verbose, **kwargs) plot()
[docs] def HR(self, idx, history='history1', states=False, verbose=False, **kwargs): """Plot the HR diagram of one or more runs. Parameters ---------- idx : int or list of int Index or indices of runs to plot. history : str Which history is going to be used. The options are: "binary_history", "history1", or "history2". states : bool If true the HR diagram shows the stellar state with a color map. verbose : bool If `True`, the object reports by printing to standard output. **kwargs : dict Dictionary containing extra visualisation options (cf. `PLOT_PROPERTIES` in `plot_defaults.py`). """ if isinstance(idx, list): runs = [] for i in idx: runs.append(self[i]) elif isinstance(idx, int): runs = [self[idx]] else: raise ValueError('Invalid idx = {}!'.format(idx)) if states: from posydon.binary_evol.singlestar import SingleStar star_states = [] for run in runs: star = SingleStar.from_run(run, history=True, profile=False) star_states.append(check_state_of_star_history_array( star, N=len(star.mass_history))) else: star_states = None plot = plot1D(run=runs, x_var_str=None, y_var_str=None, history=history, star_states=star_states, HR=True, verbose=verbose, **kwargs) plot()
def __eq__(self, other, verbose=True): """Check the equality (in terms of the data) of two PSyGrid objects.""" def say(msg): if verbose: print("COMPARISON:", msg) if not isinstance(other, PSyGrid): say("Only PSyGrid instances should be compared.") return False if len(self) != len(other): say("The grids do not contain the same number of runs ({} != {})". format(len(self), len(other))) for prop in PROPERTIES_TO_BE_CONSISTENT: val1, val2 = self.config[prop], other.config[prop] if val1 != val2: say("Property `{}` is not the same ({} != {})". format(prop, val1, val2)) return False # just report differences for parameters that are not data-relevant for prop in ["compression", "description"]: val1, val2 = self.config[prop], other.config[prop] if val1 != val2: say("Property `{}` is not the same ({} != {})". format(prop, val1, val2)) for tablename in ["initial_values", "final_values"]: arr1, arr2 = getattr(self, tablename), getattr(other, tablename) columns1, columns2 = arr1.dtype.names, arr2.dtype.names if np.all(arr1 == arr2): continue if len(columns1) != len(columns2): say("Number of columns in `{}` do not match ({} != {})". format(tablename, len(columns1), len(columns2))) return False if columns1 != columns2: say("Columns in `{}` do not match:\n{}\n!=\n{}\n".format (tablename, "\n".join(columns1), "\n".join(columns2))) return False for colname in columns1: data1, data2 = arr1[colname], arr2[colname] if len(data1) != len(data2): say("Column `{}` in `{}` not of same length.". format(colname, tablename)) if np.any(data1 != data2): for val1, val2 in zip(data1, data2): if ((val1 == val2) or (val1 is None and val2 is None) or (np.isnan(val1) and np.isnan(val2))): continue say("Column `{}` in `{}` is not the same ({} != {}).". format(colname, tablename, val1, val2)) return False for i, (run1, run2) in enumerate(zip(self, other)): for tablename in ["binary_history", "history1", "history2", "final_profile1", "final_profile2"]: arr1, arr2 = getattr(run1, tablename), getattr(run2, tablename) if arr1 is None and arr2 is None: continue # tables missing for both runs if arr1 is None or arr2 is None: which = "1st" if arr1 is None else "2nd" say("Table `{}` for run `{}` missing in {} grid.". format(tablename, i, which)) return False if np.all(arr1 != arr2): say("Table `{}` for run `{}` is not the same.". format(tablename, i)) return True
[docs] class PSyGridIterator: """Class for iterating runs (i.e., PSyRunView instances) in a PSyGrid.""" def __init__(self, grid): """Initialize by pointing to the PSyGrid object.""" self._grid = grid self._index = 0 def __next__(self): """Return current run and move to the next.""" if self._index >= len(self._grid): raise StopIteration item = self._grid[self._index] self._index += 1 return item
[docs] class PSyRunView: """Access runs in HDF5 grid without opening all data. Example ------- mygrid = PSyGrid("thegrid.h5") # opens a stored grid myrun = mygrid[0] # get a "view" of the first run in the grid print(mygrid[1].history1.star_age) t1 = myrun["binary_history"]["age"] t2 = myrun["binary_history"].age t3 = myrun.binary_history["age"] t4 = myrun.binary_history.age print(mygrid[1].history1.star_age) """ def __init__(self, psygrid, run_index): """Initialize by linking to a PSyGrid object and setting run index.""" self.psygrid = psygrid self.index = run_index def _hdf5_key(self): return "/grid/run{}".format(self.index) def __getitem__(self, key): """Get a table for a specific run using its name as in ["name"].""" if key not in VALID_KEYS: raise KeyError("Key {} not in list of valid keys.".format(key)) if key == "initial_values": return self.psygrid.initial_values[self.index] elif key == "final_values": return self.psygrid.final_values[self.index] hdf5 = self.psygrid.hdf5 if hdf5 is None: raise Exception("The HDF5 file is not open.") fullkey = self._hdf5_key() + "/" + key try: return hdf5[fullkey][()] except KeyError: if key in VALID_KEYS: # key is valid, so the data is just missing return None raise KeyError("There is no key '{}' in '{}'". format(fullkey, self.psygrid.filepath)) def __getattr__(self, key): """Enable the ability of using `.table1` instead of ["table1"].""" return self[key] def __str__(self): """Return a summary of the PSyRunView in a string.""" return "View of the run {} in the file '{}' at key '{}'".format( self.index, self.psygrid.filepath, self._hdf5_key())
[docs] def downsample_history(bh, h1, h2, params): """Downsample history data of a run. Parameters ---------- bh : record array The binary history table. h1 : record array The history of star 1. h2 : record array The history of star 2. params : ConfigFile A ConfigFile instance holding the downsampling parameters. Returns ------- tuple of record arrays The three record arrays corresponding to the downsample binary, star 1, and star2 histories. """ max_err, exclude = params["history_DS_error"], params["history_DS_exclude"] if max_err is None or (bh is None and h1 is None and h2 is None): return bh, h1, h2 # select the data based on which the downsampling will be computed number_of_rows = None dependent = [] for history_table in [bh, h1, h2]: if history_table is not None: if number_of_rows is None: number_of_rows = len(history_table) else: if len(history_table) != number_of_rows: raise Exception("Unequal numbers of rows in histories.") for column_name in history_table.dtype.names: if column_name not in exclude: dependent.append(history_table[column_name]) dependent = np.array(dependent).T if params["binary"]: independent = bh["age"] else: independent = h1["star_age"] # find which rows to keep for having optimum downsampling TD = TrackDownsampler(independent, dependent) TD.find_downsample(max_err) # ... and return only these rows def correct_shape(array): return array[0] if len(array.shape) == 2 else array return (correct_shape(bh[TD.keep]) if bh is not None else None, correct_shape(h1[TD.keep]) if h1 is not None else None, correct_shape(h2[TD.keep]) if h2 is not None else None)
[docs] def downsample_profile(profile, params): """Downsample a profile table. Parameters ---------- profile : record array The profile data with column names defined. params : ConfigFile A ConfigFile instance holding the downsampling parameters. Returns ------- record array The downsampled profile. """ max_err, exclude = params["profile_DS_error"], params["profile_DS_exclude"] max_interval = params["profile_DS_interval"] if max_err is None or profile is None: return profile # select the data based on which the downsampling will be computed dependent = [] for column_name in profile.dtype.names: if column_name not in exclude: dependent.append(profile[column_name][::-1]) # invert order dependent = np.array(dependent).T independent = profile["mass"][::-1] # find which rows to keep for having optimum downsampling TD = TrackDownsampler(independent, dependent) TD.find_downsample(max_err, max_interval) # ... and return only these rows return profile[TD.keep[::-1]] # invert order again
PROPERTIES_ALLOWED = {"format": "hdf5"} PROPERTIES_TO_BE_SET = ["compression", "description"] PROPERTIES_TO_BE_NONE = { "max_number_of_runs": None, "star1_history_saved_columns": None, "star2_history_saved_columns": None, "binary_history_saved_columns": None, "star1_profile_saved_columns": None, "star2_profile_saved_columns": None, "initial_value_columns": None, "final_value_columns": None, } PROPERTIES_TO_BE_CONSISTENT = ["binary", "eep", "start_at_RLO", "stop_before_carbon_depletion", "initial_RLO_fix", "He_core_fix", "accept_missing_profile", "history_DS_error", "history_DS_exclude", "profile_DS_error", "profile_DS_exclude", "profile_DS_interval"] ALL_PROPERTIES = (list(PROPERTIES_ALLOWED.keys()) + PROPERTIES_TO_BE_CONSISTENT + list(PROPERTIES_TO_BE_NONE.keys()) + PROPERTIES_TO_BE_SET)
[docs] def join_grids(input_paths, output_path, compression="gzip9", description="joined", verbose=True): """Join two or more PSyGrid HDF5 files into one.""" def say(something): if verbose: print(something) # open the grids, and figure out the configuration of the new joined grid grids = [] newconfig = None initial_dtype = None final_dtype = None for input_path in input_paths: say("Checking {}".format(input_path)) say(" opening and getting configuration...") grid = PSyGrid(input_path) curconfig = grid.config # check we did not forget about a property, and their consistency assert len(ALL_PROPERTIES) == len(curconfig.keys()) for key in curconfig.keys(): assert key in ALL_PROPERTIES if newconfig is None: # this is the first grid say(" copying configuration...") newconfig = curconfig.deepcopy() initial_dtype = grid.initial_values.dtype final_dtype = grid.final_values.dtype else: say(" checking consistency of configuration and dtypes...") for prop in PROPERTIES_TO_BE_CONSISTENT: newvalue, curvalue = newconfig[prop], curconfig[prop] if newvalue != curvalue: raise Exception("Inconsistent value for `{}`: {} != {}". format(prop, curvalue, newvalue)) if initial_dtype != grid.initial_values.dtype: raise Exception("Initial values dtype's do not match.") if final_dtype != grid.final_values.dtype: raise Exception("Final values dtype's do not match.") grids.append(grid) say("Finalizing grid properties...") # ensure supported property values for prop, allowed_value in PROPERTIES_ALLOWED.items(): if newconfig[prop] != allowed_value: raise ValueError("Only grid with `{}`={} can be joined.". format(prop, allowed_value)) # set fixed property values newconfig.update(PROPERTIES_TO_BE_NONE) # new values set by used assert("compression" in PROPERTIES_TO_BE_SET and "description" in PROPERTIES_TO_BE_SET) newconfig["compression"] = compression newconfig["description"] = description # for each set of initial parameters, indicate the grid and the run to use # (this handles reruns, substituting older runs). say("Inferring which runs to include, from which grid...") initial_params = {} n_substitutions = 0 n_ignored_noRLO = 0 for grid_index, grid in enumerate(grids): for run_index, dir_path in enumerate(grid.MESA_dirs): # get the parameters part from the dir name params_from_path = initial_values_from_dirname(dir_path) if params_from_path in initial_params: n_substitutions += 1 initial_params[params_from_path] = (grid_index, run_index) if (grid[run_index].final_values["termination_flag_1"] == "ignored_no_RLO"): if params_from_path in initial_params: del initial_params[params_from_path] n_ignored_noRLO += 1 say(" {} substitutions detected.".format(n_substitutions)) if newconfig["start_at_RLO"]: say(" {} runs ignored beacuse of no RLOF.".format(n_ignored_noRLO)) say(" {} runs to be joined.".format(len(initial_params))) if (newconfig["initial_RLO_fix"]): say("Determine initial RLO boundary from all grids") detected_initial_RLO = [] colnames = ["termination_flag_1", "termination_flag_2", "interpolation_class"] valtoset = ["forced_initial_RLO", "forced_initial_RLO", "initial_MT"] for grid in grids: new_detected_initial_RLO = get_detected_initial_RLO(grid) for new_sys in new_detected_initial_RLO: exists_already = False for i, sys in enumerate(detected_initial_RLO): # check whether there are double entries if (abs(sys["star_1_mass"]-new_sys["star_1_mass"])<1.0e-5 and abs(sys["star_2_mass"]-new_sys["star_2_mass"])<1.0e-5): exists_already = True # if so, replace old entry if the new one has a larger period if sys["period_days"]<new_sys["period_days"]: detected_initial_RLO[i] = new_sys.copy() # add non existing new entry if not exists_already: detected_initial_RLO.append(new_sys) say("Opening new file...") # open new HDF5 file and start copying runs driver_args = {} if "%d" not in output_path else { "driver": "family", "memb_size": HDF5_MEMBER_SIZE} # prepare for loss-less compression compression = newconfig["compression"] compression = "" if compression is None else compression.lower() if compression == "": compression_args = {} elif compression == "lzf": compression_args = {"compression": "lzf"} elif compression.startswith("gzip"): if compression == "gzip": compression_args = {"compression": "gzip"} else: if len(compression) > 5 or not compression[4].isdigit(): raise ValueError("GZIP compression level must be 0-9.") compression_level = int(compression[4]) compression_args = {"compression": "gzip", "compression_opts": compression_level} else: raise ValueError("Unknown compression `{}`.".format(compression)) with h5py.File(output_path, "w", **driver_args) as new_grid: say("Copying runs...") new_mesa_dirs = [] new_initial_values = [] new_final_values = [] new_index = 0 for grid_index, run_index in initial_params.values(): grid = grids[grid_index] new_mesa_dirs.append(grid.MESA_dirs[run_index]) new_initial_values.append(grid.initial_values[run_index]) new_final_values.append(grid.final_values[run_index]) if (newconfig["initial_RLO_fix"]): flag1 = new_final_values[-1]["termination_flag_1"] if (flag1 != "Terminate because of overflowing initial model" and flag1 != "forced_initial_RLO"): mass1 = new_initial_values[-1]["star_1_mass"] mass2 = new_initial_values[-1]["star_2_mass"] period = new_initial_values[-1]["period_days"] nearest = get_nearest_known_initial_RLO(mass1, mass2, detected_initial_RLO) if period < nearest["period_days"]: # set values for colname, value in zip(colnames, valtoset): new_final_values[-1][colname] = value # copy values from nearest known system for colname in ["termination_flag_3", "termination_flag_4"]: if colname in nearest: new_final_values[-1][colname]=nearest[colname] for subarray in ['binary_history', 'history1', 'history2', 'final_profile1', 'final_profile2']: # ensure group is created even if no data to be written new_grp = "/grid/run{}/".format(new_index) if new_grp not in new_grid: new_grid.create_group(new_grp) # now copy the data new_key = "/grid/run{}/{}".format(new_index, subarray) old_key = "/grid/run{}/{}".format(run_index, subarray) try: old_data = grid.hdf5[old_key] new_grid.create_dataset(new_key, data=old_data, **compression_args) except KeyError: pass new_index += 1 say("Writing initial/final values and metadata...") new_mesa_dirs = np.array(new_mesa_dirs, dtype=H5_UNICODE_DTYPE) new_initial_values = np.array(new_initial_values, dtype=initial_dtype) new_final_dtype = [] for dtype in final_dtype.descr: if (dtype[0].startswith("termination_flag") or ("SN_type" in dtype[0]) or ("_state" in dtype[0]) or ("_class" in dtype[0])): dtype = (dtype[0], H5_REC_STR_DTYPE.replace("U", "S")) new_final_dtype.append(dtype) new_final_values = np.array(new_final_values, dtype=new_final_dtype) new_grid.attrs["config"] = json.dumps(str(dict(newconfig))) new_grid.create_dataset("relative_file_paths", data=new_mesa_dirs, **compression_args) new_grid.create_dataset("/grid/initial_values", data=new_initial_values, **compression_args) new_grid.create_dataset("/grid/final_values", data=new_final_values, **compression_args) say("Grids have been successfully joined.")