src.dataset.base_dataset module
- class src.dataset.base_dataset.RasterizedMap(map: numpy.array | None = None, xmin: float | None = None, ymin: float | None = None, xmax: float | None = None, ymax: float | None = None)
Bases:
object
Rasterized map data structure.
- map
2D raster map array, where 0 usually means walkable area and 1 means obstacle.
- Type:
np.array
- xmin
Minimum x value of the map in world coordinates.
- Type:
float
- ymin
Minimum y value of the map in world coordinates.
- Type:
float
- xmax
Maximum x value of the map in world coordinates.
- Type:
float
- ymax
Maximum y value of the map in world coordinates.
- Type:
float
- map: numpy.array = None
- xmin: float = None
- ymin: float = None
- xmax: float = None
- ymax: float = None
- __init__(map: numpy.array | None = None, xmin: float | None = None, ymin: float | None = None, xmax: float | None = None, ymax: float | None = None) None
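As an illustration of how the bounds relate to the raster, here is a minimal sketch that looks up the cell under a world-coordinate point. The `RasterizedMap` below is a stand-in that mirrors the documented fields, `world_to_cell` is a hypothetical helper that is not part of the module, and the axis layout (rows along y starting at ymin, columns along x) is an assumption that may not match the real maps.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import numpy as np


@dataclass
class RasterizedMap:
    """Stand-in mirroring the documented fields (not the real class)."""
    map: Optional[np.ndarray] = None
    xmin: Optional[float] = None
    ymin: Optional[float] = None
    xmax: Optional[float] = None
    ymax: Optional[float] = None


def world_to_cell(m: RasterizedMap, x: float, y: float) -> Tuple[int, int]:
    """Hypothetical helper: map a world point to a (row, col) cell.

    Assumes rows span [ymin, ymax] with row 0 at ymin, and columns
    span [xmin, xmax] with column 0 at xmin.
    """
    rows, cols = m.map.shape
    col = int((x - m.xmin) / (m.xmax - m.xmin) * cols)
    row = int((y - m.ymin) / (m.ymax - m.ymin) * rows)
    # Clamp so points on or beyond the edges land in a valid cell.
    return min(max(row, 0), rows - 1), min(max(col, 0), cols - 1)


grid = np.zeros((4, 8), dtype=np.uint8)  # 0 = walkable
grid[1, 6] = 1                           # 1 = obstacle
raster = RasterizedMap(map=grid, xmin=0.0, ymin=0.0, xmax=8.0, ymax=4.0)

row, col = world_to_cell(raster, x=6.5, y=1.2)
is_obstacle = bool(raster.map[row, col])
```

With one-unit cells, the point (6.5, 1.2) falls in row 1, column 6, which is the cell marked as an obstacle.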
- class src.dataset.base_dataset.BaseDataset(*args: Any, **kwargs: Any)
Bases:
Dataset
Base class for all pedestrian-trajectory prediction datasets.
Provides shared logic for sample splitting, resampling, coordinate normalization, cache management, and the PyTorch DataLoader collate_fn. Dataset-specific loading logic is implemented by subclasses through load_data.
- __init__(name: str, args: Namespace, df_data: pandas.DataFrame, map_data: RasterizedMap | None = None)
Initialize the dataset.
- Parameters:
name (str) – Dataset name, e.g. “eth” or “zara01”.
args (Namespace) – Global configuration containing fps, hist_step, pred_step, and related fields.
df_data (pd.DataFrame) – DataFrame containing all trajectory data. Required columns: [‘f’, ‘id’, ‘x’, ‘y’, ‘type’].
- f: frame index
- id: trajectory ID
- x, y: coordinates
- type: ‘pedestrian’ or ‘vehicle’
map_data (RasterizedMap, optional) – Scene map data.
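To make the expected `df_data` layout concrete, the sketch below builds a tiny frame with the required columns and checks it. `REQUIRED_COLUMNS` and `check_schema` are hypothetical names used only for illustration; the real class may validate its input differently or not at all.

```python
import pandas as pd

# Column names taken from the parameter description above.
REQUIRED_COLUMNS = ["f", "id", "x", "y", "type"]

df_data = pd.DataFrame(
    {
        "f": [0, 0, 1, 1],                  # frame index
        "id": [1, 2, 1, 2],                 # trajectory ID
        "x": [0.0, 5.0, 0.4, 4.5],          # x coordinate
        "y": [0.0, 1.0, 0.1, 1.0],          # y coordinate
        "type": ["pedestrian", "vehicle", "pedestrian", "vehicle"],
    }
)


def check_schema(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fail early if the layout is not as documented."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    unknown = set(df["type"]) - {"pedestrian", "vehicle"}
    if unknown:
        raise ValueError(f"unexpected agent types: {sorted(unknown)}")
    return df


check_schema(df_data)
```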
- classmethod load_data(args) BaseDataset
[Abstract method] Load data from a file path and return a dataset instance.
Subclasses must implement this method to handle dataset-specific raw formats.
- Parameters:
args (Namespace) – Global configuration.
- Returns:
Loaded dataset instance.
- Return type:
BaseDataset
- Raises:
NotImplementedError – Raised when a subclass does not implement this method.
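The subclassing contract can be sketched as follows. `BaseDatasetSketch` is a simplified stand-in with no PyTorch dependency, and `ToyDataset` and its hard-coded row are hypothetical; a real subclass would parse its own raw files and pass a pandas DataFrame.

```python
from argparse import Namespace


class BaseDatasetSketch:
    """Stand-in for BaseDataset, reduced to the load_data contract."""

    def __init__(self, name, args, df_data, map_data=None):
        self.name = name
        self.args = args
        self.df_data = df_data
        self.map_data = map_data

    @classmethod
    def load_data(cls, args):
        # The base class only defines the interface.
        raise NotImplementedError(f"{cls.__name__} must implement load_data")


class ToyDataset(BaseDatasetSketch):
    """Hypothetical subclass showing where format-specific loading goes."""

    @classmethod
    def load_data(cls, args):
        # A real subclass would read and convert its raw files here.
        rows = [{"f": 0, "id": 1, "x": 0.0, "y": 0.0, "type": "pedestrian"}]
        return cls(name="toy", args=args, df_data=rows)


args = Namespace(fps=2.5, hist_step=8, pred_step=12)
dataset = ToyDataset.load_data(args)
```

Using a classmethod as the entry point lets callers build any dataset from the shared configuration object without knowing its raw format.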
- split_samples(df_data, use_tqdm=True)
Split continuous trajectory data into sliding-window samples for training and evaluation.
Samples are generated according to args.hist_step, args.pred_step, and args.skip_step. Each sample contains the pedestrian and vehicle history, future labels, and context for the current scene.
- Parameters:
df_data (pd.DataFrame) – DataFrame containing complete trajectories.
use_tqdm (bool, optional) – Whether to show a progress bar.
- Returns:
- List of sample dictionaries, including:
pos: current positions (#ped, 2)
vel: current velocities (#ped, 2)
des: destinations (#ped, 2)
spd: desired speeds (#ped, 1)
hst: history trajectories (#ped, hist_step, 2)
veh: vehicle history (#veh, hist_step+1, 2)
future_acc: future acceleration labels (#ped, pred_step, 2)
…
- Return type:
List[dict]
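The sliding-window idea can be sketched with frame indices alone. This is not the module's implementation: `sliding_windows` is a hypothetical helper, and it assumes each window holds hist_step history frames, one current frame, and pred_step future frames, with consecutive windows starting skip_step frames apart.

```python
def sliding_windows(num_frames, hist_step, pred_step, skip_step=1):
    """Return (start, current, end) frame indices for each window.

    Assumption: window length is hist_step + 1 + pred_step, where the
    extra frame is the current frame that separates history from future.
    """
    length = hist_step + 1 + pred_step
    windows = []
    for start in range(0, num_frames - length + 1, skip_step):
        current = start + hist_step
        windows.append((start, current, start + length - 1))
    return windows


windows = sliding_windows(num_frames=10, hist_step=2, pred_step=3, skip_step=2)
# windows == [(0, 2, 5), (2, 4, 7), (4, 6, 9)]
```

A sequence shorter than one full window produces no samples, which is why short trajectories can drop out of the dataset entirely.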
- static collate_fn(batch)
Custom DataLoader collate function for padding variable-length sequences.
- Parameters:
batch (List[dict]) – Sample list returned by __getitem__.
- Returns:
- Batched tensors after padding and stacking.
Includes keys such as ‘pos’, ‘vel’, ‘ped_length’, and ‘veh_length’. Padding values are usually 0 for coordinates or -1 for IDs.
- Return type:
dict
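The padding step can be illustrated with NumPy arrays in place of tensors. `pad_and_stack` is a hypothetical helper, not the module's `collate_fn`; it pads each sample along its first axis (the pedestrian count) and returns the original lengths, which is the role that keys like ‘ped_length’ appear to play.

```python
import numpy as np


def pad_and_stack(arrays, pad_value=0.0):
    """Pad arrays along axis 0 to a common length, then stack them."""
    lengths = np.array([a.shape[0] for a in arrays])
    max_len = int(lengths.max())
    out = np.full(
        (len(arrays), max_len) + arrays[0].shape[1:],
        pad_value,
        dtype=arrays[0].dtype,
    )
    for i, a in enumerate(arrays):
        out[i, : a.shape[0]] = a  # real rows first, padding after
    return out, lengths


# Two samples with 3 and 1 pedestrians respectively.
batch = [
    {"pos": np.ones((3, 2), dtype=np.float32)},
    {"pos": np.ones((1, 2), dtype=np.float32)},
]
pos, ped_length = pad_and_stack([sample["pos"] for sample in batch])
```

Here `pos` has shape (2, 3, 2), and `ped_length` records that only the first row of the second sample is real data.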
- static resample_dataframe(df_data, raw_fps=30, target_fps=2.5)
Resample trajectory data to the target frame rate expected by the model.
- Parameters:
df_data (pd.DataFrame) – Raw trajectory data.
raw_fps (float) – Original frame rate.
target_fps (float) – Target frame rate.
- Returns:
Resampled DataFrame with interpolated coordinates and updated frame indices.
- Return type:
pd.DataFrame
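One way to resample a single trajectory is linear interpolation onto the target time grid, sketched below. `resample_track` is a hypothetical helper and not the module's method: it assumes rows are sorted by frame, handles one trajectory at a time, and renumbers the output frames from 0, none of which is confirmed by the documentation.

```python
import numpy as np
import pandas as pd


def resample_track(track, raw_fps=30.0, target_fps=2.5):
    """Linearly interpolate one trajectory onto the target frame grid."""
    t = track["f"].to_numpy(dtype=float) / raw_fps  # timestamps in seconds
    step = 1.0 / target_fps
    # Number of target timestamps that fit inside the observed span.
    n = int(np.floor((t[-1] - t[0]) / step + 1e-9)) + 1
    new_t = t[0] + step * np.arange(n)
    return pd.DataFrame(
        {
            "f": np.arange(n),  # assumed: frames renumbered from 0
            "x": np.interp(new_t, t, track["x"].to_numpy(dtype=float)),
            "y": np.interp(new_t, t, track["y"].to_numpy(dtype=float)),
        }
    )


# 30 fps input sampled every 6 frames; 2.5 fps keeps one point per 12 frames.
raw = pd.DataFrame(
    {
        "f": [0, 6, 12, 18, 24],
        "x": [0.0, 1.0, 2.0, 3.0, 4.0],
        "y": [0.0, 0.0, 0.0, 0.0, 0.0],
    }
)
out = resample_track(raw)
```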
- static normalize_xy(df_data, map_data)
Apply Z-score-style normalization to coordinate data.
The current implementation fixes std = 1.0, so it only centers the coordinates and does not rescale them. Commented-out code in the source shows how full standard-deviation scaling would work.
- Parameters:
df_data (pd.DataFrame) – Trajectory data.
map_data (RasterizedMap) – Map data.
- Returns:
(normalized_df_data, updated_map_data)
- Return type:
tuple
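The centering behavior can be sketched on plain arrays. `center_xy` is a hypothetical helper, and shifting the map bounds by the same offset is an assumption about what "updated_map_data" means; the real method works on a DataFrame and a RasterizedMap.

```python
import numpy as np


def center_xy(xy, bounds, std=1.0):
    """Z-score form (value - mean) / std, with std left at 1.0.

    With std = 1.0 the division is a no-op, so only the mean shift
    changes the data.
    """
    mean = xy.mean(axis=0)
    xy_norm = (xy - mean) / std
    xmin, ymin, xmax, ymax = bounds
    # Assumed: map bounds move with the data so both stay aligned.
    new_bounds = (
        (xmin - mean[0]) / std,
        (ymin - mean[1]) / std,
        (xmax - mean[0]) / std,
        (ymax - mean[1]) / std,
    )
    return xy_norm, new_bounds


xy = np.array([[1.0, 2.0], [3.0, 6.0]])
xy_norm, new_bounds = center_xy(xy, bounds=(0.0, 0.0, 10.0, 10.0))
```

After centering, the coordinates have zero mean per axis while the distances between points are unchanged.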