src.dataset.base_dataset module

class src.dataset.base_dataset.RasterizedMap

Bases: object

Rasterized map data structure.

map

2D raster map array, where 0 usually means walkable area and 1 means obstacle.

Type: np.array

xmin

Minimum x value of the map in world coordinates.

Type: float

ymin

Minimum y value of the map in world coordinates.

Type: float

xmax

Maximum x value of the map in world coordinates.

Type: float

ymax

Maximum y value of the map in world coordinates.

Type: float

map: numpy.array = None

xmin: float = None

ymin: float = None

xmax: float = None

ymax: float = None

__init__(map: numpy.array | None = None, xmin: float | None = None, ymin: float | None = None, xmax: float | None = None, ymax: float | None = None) → None
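To illustrate how the world-coordinate bounds relate to the raster array, the sketch below mirrors the documented fields in a stand-in dataclass and converts a world point to an array index. The names `RasterizedMapSketch` and `world_to_cell` are hypothetical, and the convention that rows follow y and columns follow x with uniform cell size is an assumption; the real class may index its array differently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import numpy as np


@dataclass
class RasterizedMapSketch:
    """Stand-in mirroring the documented fields (hypothetical name)."""
    map: Optional[np.ndarray] = None
    xmin: Optional[float] = None
    ymin: Optional[float] = None
    xmax: Optional[float] = None
    ymax: Optional[float] = None


def world_to_cell(m: RasterizedMapSketch, x: float, y: float) -> Tuple[int, int]:
    """Map a world point inside the bounds to (row, col).

    Assumes rows follow y and columns follow x (not confirmed by the source).
    """
    rows, cols = m.map.shape
    # Fraction of the way across the world extent, in [0, 1].
    fx = (x - m.xmin) / (m.xmax - m.xmin)
    fy = (y - m.ymin) / (m.ymax - m.ymin)
    # Clamp so points on the upper boundary land in the last cell.
    col = min(int(fx * cols), cols - 1)
    row = min(int(fy * rows), rows - 1)
    return row, col


grid = np.zeros((4, 8), dtype=np.uint8)
grid[1, 6] = 1  # one obstacle cell
demo = RasterizedMapSketch(map=grid, xmin=0.0, ymin=0.0, xmax=8.0, ymax=4.0)
row, col = world_to_cell(demo, 6.5, 1.2)
is_obstacle = bool(demo.map[row, col] == 1)
```

Here an 8 m by 4 m extent is covered by an 8 by 4 grid, so each cell spans one metre and the lookup reduces to truncating the coordinates.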

exception src.dataset.base_dataset.EmptyDatasetError

Bases: BaseException

class src.dataset.base_dataset.BaseDataset(*args: Any, **kwargs: Any)

Bases: Dataset

Base class for all pedestrian-trajectory prediction datasets.

Provides shared logic for sample splitting, resampling, coordinate normalization, cache management, and the PyTorch DataLoader collate_fn. Dataset-specific loading logic is implemented by subclasses through load_data.

__init__(name: str, args: Namespace, df_data: pandas.DataFrame, map_data: RasterizedMap | None = None)

Initialize the dataset.

Parameters:

name (str) – Dataset name, e.g. “eth” or “zara01”.
args (Namespace) – Global configuration containing fps, hist_step, pred_step, and related fields.
df_data (pd.DataFrame) – DataFrame containing all trajectory data. Required columns: [‘f’, ‘id’, ‘x’, ‘y’, ‘type’].
- f: frame index
- id: trajectory ID
- x, y: coordinates
- type: ‘pedestrian’ or ‘vehicle’
map_data (RasterizedMap, optional) – Scene map data.

classmethod load_data(args) → BaseDataset

[Abstract method] Load data from a file path and return a dataset instance.

Subclasses must implement this method to handle dataset-specific raw formats.

Parameters:

args (Namespace) – Global configuration.

Returns:

Loaded dataset instance.

Return type:

BaseDataset

Raises:

NotImplementedError – Raised when a subclass does not implement this method.
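The contract described here can be sketched with a stand-in base class. `BaseDatasetSketch` and `ToyDataset` are hypothetical names, the rows are hard-coded rather than parsed from a file, a plain list of dicts stands in for the DataFrame, and the `hist_step` and `pred_step` values are illustrative only.

```python
from argparse import Namespace


class BaseDatasetSketch:
    """Stand-in for BaseDataset; only the load_data contract is modeled."""

    def __init__(self, name, args, df_data, map_data=None):
        self.name = name
        self.args = args
        self.df_data = df_data
        self.map_data = map_data

    @classmethod
    def load_data(cls, args):
        # The base class has no raw format to parse.
        raise NotImplementedError(f"{cls.__name__} must implement load_data")


class ToyDataset(BaseDatasetSketch):
    """Hypothetical subclass with hard-coded rows instead of file parsing."""

    @classmethod
    def load_data(cls, args):
        # A list of dicts stands in for the DataFrame with the required columns.
        rows = [
            {"f": 0, "id": 1, "x": 0.0, "y": 0.0, "type": "pedestrian"},
            {"f": 1, "id": 1, "x": 0.4, "y": 0.1, "type": "pedestrian"},
        ]
        return cls(name="toy", args=args, df_data=rows)


args = Namespace(fps=2.5, hist_step=8, pred_step=12)  # illustrative values
dataset = ToyDataset.load_data(args)

try:
    BaseDatasetSketch.load_data(args)
    base_raises = False
except NotImplementedError:
    base_raises = True
```

Returning `cls(...)` rather than a hard-coded class name keeps the classmethod usable from further subclasses.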

split_samples(df_data, use_tqdm=True)

Split continuous trajectory data into sliding-window samples for training and evaluation.

Samples are generated according to args.hist_step, args.pred_step, and args.skip_step. Each sample contains the pedestrian and vehicle history, future labels, and context for the current scene.

Parameters:

df_data (pd.DataFrame) – DataFrame containing complete trajectories.
use_tqdm (bool, optional) – Whether to show a progress bar.

Returns:

List of sample dictionaries, including:

pos: current positions (#ped, 2)
vel: current velocities (#ped, 2)
des: destinations (#ped, 2)
spd: desired speeds (#ped, 1)
hst: history trajectories (#ped, hist_step, 2)
veh: vehicle history (#veh, hist_step+1, 2)
future_acc: future acceleration labels (#ped, pred_step, 2)
…

Return type:

List[dict]
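The sliding-window idea can be sketched for a single trajectory as follows. This is a simplified illustration, not the method's actual logic: the function name `sliding_windows` and the `fut` key are hypothetical, `skip_step` is assumed to be the stride between consecutive windows, and the real method also assembles velocities, destinations, vehicle history, and acceleration labels across all agents in a scene.

```python
import numpy as np


def sliding_windows(traj, hist_step, pred_step, skip_step=1):
    """Cut one (T, 2) trajectory into history / current / future windows."""
    samples = []
    # Last index that still leaves pred_step future points after it.
    last_current = len(traj) - pred_step - 1
    for t in range(hist_step, last_current + 1, skip_step):
        samples.append({
            "hst": traj[t - hist_step:t],          # (hist_step, 2)
            "pos": traj[t],                        # (2,)
            "fut": traj[t + 1:t + 1 + pred_step],  # (pred_step, 2), hypothetical key
        })
    return samples


# Ten points moving along x at one unit per step.
traj = np.stack([np.arange(10.0), np.zeros(10)], axis=1)
windows = sliding_windows(traj, hist_step=3, pred_step=4, skip_step=2)
```

With ten points, three history steps, and four prediction steps, valid current indices run from 3 to 5, so a stride of 2 yields two windows.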

static collate_fn(batch)

Custom DataLoader collate function for padding variable-length sequences.

Parameters:

batch (List[dict]) – Sample list returned by __getitem__.

Returns:

Batched tensors after padding and stacking. Includes keys such as ‘pos’, ‘vel’, ‘ped_length’, and ‘veh_length’. Padding values are usually 0 for coordinates or -1 for IDs.

Return type:

dict
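A minimal sketch of the pad-and-stack step is shown below. It uses NumPy arrays to stay dependency-light, whereas the documented method is described as returning batched tensors; the helper name `pad_and_stack` is hypothetical.

```python
import numpy as np


def pad_and_stack(batch, key, pad_value=0.0):
    """Pad each sample's (n_i, ...) array along axis 0, then stack."""
    lengths = np.array([len(s[key]) for s in batch])
    max_len = int(lengths.max())
    trailing = batch[0][key].shape[1:]
    # Pre-fill with the padding value, then copy each sample into place.
    out = np.full((len(batch), max_len) + trailing, pad_value, dtype=np.float32)
    for i, s in enumerate(batch):
        out[i, :lengths[i]] = s[key]
    return out, lengths


batch = [
    {"pos": np.ones((3, 2), dtype=np.float32)},  # scene with 3 pedestrians
    {"pos": np.ones((1, 2), dtype=np.float32)},  # scene with 1 pedestrian
]
pos, ped_length = pad_and_stack(batch, "pos")
```

Returning the original lengths alongside the padded array lets downstream code mask out the padded rows.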

static resample_dataframe(df_data, raw_fps=30, target_fps=2.5)

Resample trajectory data to the target frame rate expected by the model.

Parameters:

df_data (pd.DataFrame) – Raw trajectory data.
raw_fps (float) – Original frame rate.
target_fps (float) – Target frame rate.

Returns:

Resampled DataFrame with interpolated coordinates and updated frame indices.

Return type:

pd.DataFrame
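As a rough illustration of frame-rate resampling, the sketch below linearly interpolates one track onto a grid spaced at `1 / target_fps` seconds. It works on plain NumPy arrays rather than a DataFrame, the helper name `resample_track` is hypothetical, and both the use of linear interpolation and the re-indexing of frames from 0 are assumptions about how the documented method behaves.

```python
import numpy as np


def resample_track(frames, xy, raw_fps=30.0, target_fps=2.5):
    """Linearly interpolate one track onto a target_fps time grid."""
    t_raw = np.asarray(frames, dtype=float) / raw_fps  # frame index -> seconds
    dt = 1.0 / target_fps
    # Number of target samples that fit in the observed time span.
    n_steps = int(np.floor((t_raw[-1] - t_raw[0]) / dt + 1e-9)) + 1
    t_new = t_raw[0] + dt * np.arange(n_steps)
    x = np.interp(t_new, t_raw, xy[:, 0])
    y = np.interp(t_new, t_raw, xy[:, 1])
    new_frames = np.arange(n_steps)  # re-indexed at the target rate (assumption)
    return new_frames, np.stack([x, y], axis=1)


frames = np.arange(0, 61, 6)  # every 6th frame of a 30 fps recording
xy = np.stack([frames / 30.0, np.zeros(len(frames))], axis=1)  # 1 m/s along x
new_frames, new_xy = resample_track(frames, xy)
```

At the default rates, one target step covers 30 / 2.5 = 12 raw frames, so two seconds of input produce six resampled points.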

static normalize_xy(df_data, map_data)

Apply Z-score-style normalization to coordinate data.

The current implementation uses std = 1.0, so it only centers the coordinates and does not rescale them. Commented-out code in the source shows how full standard-deviation scaling would work.

Parameters:

df_data (pd.DataFrame) – Trajectory data.
map_data (RasterizedMap) – Map data.

Returns:

(normalized_df_data, updated_map_data)

Return type:

tuple
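The effect of fixing std = 1.0 can be sketched as below. The helper name `center_xy` is hypothetical, and two points are assumptions rather than documented behavior: that the offset is the mean of the trajectory coordinates, and that the map bounds are shifted by the same offset so they stay aligned with the data.

```python
import numpy as np


def center_xy(xy, bounds, std=1.0):
    """Subtract the mean and divide by std (std = 1.0 means centering only)."""
    mean = xy.mean(axis=0)
    xy_norm = (xy - mean) / std
    xmin, ymin, xmax, ymax = bounds
    # Apply the same transform to the map bounds (assumption).
    new_bounds = (
        (xmin - mean[0]) / std,
        (ymin - mean[1]) / std,
        (xmax - mean[0]) / std,
        (ymax - mean[1]) / std,
    )
    return xy_norm, new_bounds


xy = np.array([[1.0, 2.0], [3.0, 6.0]])
xy_norm, new_bounds = center_xy(xy, bounds=(0.0, 0.0, 4.0, 8.0))
```

Passing the per-axis standard deviation as `std` instead of 1.0 would turn this into a full Z-score transform.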

static save_cache(obj, cache_path)

Serialize and save a dataset object to disk cache.

static load_cache(cache_path)

Load a dataset object from disk cache.
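A cache round trip of this kind could look like the sketch below. The source does not state which serialization format is used; `pickle` is an assumption here, as is creating the parent directory on save.

```python
import os
import pickle
import tempfile


def save_cache(obj, cache_path):
    """Serialize obj to cache_path, creating the parent directory if needed."""
    os.makedirs(os.path.dirname(os.path.abspath(cache_path)), exist_ok=True)
    with open(cache_path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_cache(cache_path):
    """Load a previously cached object; only unpickle files you trust."""
    with open(cache_path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache", "toy.pkl")
    save_cache({"name": "toy", "n_samples": 3}, path)
    restored = load_cache(path)
```

Because unpickling can execute arbitrary code, a cache written this way should only be loaded from locations the user controls.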
