gnrs.deduplication
gnrs.deduplication.deduplication_task
This module provides the DuplicateRemovalTask class for removing duplicate crystal structures from the pool.
This source code is licensed under the BSD-3-Clause license found in the LICENSE file in the root directory of this source tree.
- class gnrs.deduplication.deduplication_task.DuplicateRemovalTask
Bases: TaskABC
Task for removing duplicate crystal structures from the pool.
Initialize the duplicate removal task.
- Parameters:
comm – MPI communicator
config – Config dictionary
gnrs_info – Genarris info dictionary
- TASK_NAME = 'duplicate_removal'
- pack_settings()
Pack settings needed for duplicate removal.
- Returns:
Task settings dictionary
- Return type:
dict
- print_settings(task_set)
Print task settings in a formatted table.
- Parameters:
task_set (dict) – Task settings dictionary
- Return type:
None
gnrs.deduplication.dedup
Duplicate structure removal using pymatgen StructureMatcher.
Structures are first grouped by space group, so that comparisons are only made between structures in the same space group. Within each group, a reference structure is broadcast to all MPI ranks and compared against the remaining candidates in parallel.
- gnrs.deduplication.dedup.group_by_spg(structs)
Group structures by space group.
- Parameters:
structs (dict[str, ase.atoms.Atoms]) – {name: Atoms}.
- Returns:
{spg: {name: Atoms, …}}.
- Return type:
dict
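The nested return shape can be illustrated with a minimal sketch. It is not the package's implementation: the FakeAtoms stand-in, the group_by_spg_sketch name, and the assumption that the space group is stored under an info key called "spg" are all hypothetical, chosen only so the example runs without ASE installed.

```python
from collections import defaultdict


class FakeAtoms:
    """Hypothetical stand-in for ase.atoms.Atoms; it only carries an info dict."""

    def __init__(self, **info):
        self.info = info


def group_by_spg_sketch(structs, spg_key="spg"):
    """Return {spg: {name: atoms}}.

    Assumption: the space group lives in atoms.info[spg_key]. Structures
    without that key are collected under None.
    """
    groups = defaultdict(dict)
    for name, atoms in structs.items():
        groups[atoms.info.get(spg_key)][name] = atoms
    return dict(groups)


structs = {
    "a": FakeAtoms(spg=14),
    "b": FakeAtoms(spg=2),
    "c": FakeAtoms(spg=14),
    "d": FakeAtoms(),  # no space group recorded
}
groups = group_by_spg_sketch(structs)
```

Each inner dictionary has the same {name: Atoms} shape as the input, so a single group can be passed on as a pool of its own.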
- gnrs.deduplication.dedup.dedup_group(pool, matcher, spg, energy_key)
Remove duplicates from a space group in parallel.
- Master picks one candidate from the pool and broadcasts its pymatgen Structure to all ranks.
- The remaining structures are scattered across ranks; each rank tests matcher.fit(candidate, local_struct) in parallel.
- Match results are gathered. Master collects the duplicate cluster, selects the best structure, and removes the cluster from the pool.
These steps repeat until the pool is empty.
- Parameters:
pool (dict[str, ase.atoms.Atoms]) – {name: Atoms} — all structures in this space group (only meaningful on master; ignored on workers).
matcher (pymatgen.analysis.structure_matcher.StructureMatcher) – Configured StructureMatcher instance.
spg (int | None) – Space group.
energy_key (str | None) – Key in Atoms.info for energy, or None.
- Returns:
{name: Atoms}, the unique structures in the space group.
- Return type:
dict[str, ase.atoms.Atoms]
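The pick, compare, and prune loop described above can be sketched serially in plain Python. This is an illustration under stated assumptions, not the module's code: dedup_group_sketch, is_match, and n_ranks are hypothetical names, the MPI broadcast, scatter, and gather are simulated with list slicing, structures are plain dicts rather than Atoms, and "best" is assumed to mean lowest energy (falling back to the candidate itself when energy_key is None).

```python
def dedup_group_sketch(pool, is_match, energy_key=None, n_ranks=2):
    """Serial illustration of the duplicate-removal loop.

    is_match(a, b) stands in for matcher.fit(a, b); n_ranks only
    simulates how the remaining structures would be split across ranks.
    """
    pool = dict(pool)  # work on a copy so the caller's pool is untouched
    unique = {}
    while pool:
        # "Broadcast": take one candidate out of the pool.
        cand_name = next(iter(pool))
        cand = pool.pop(cand_name)
        # "Scatter": deal the remaining names round-robin to the ranks.
        names = list(pool)
        chunks = [names[rank::n_ranks] for rank in range(n_ranks)]
        # Each rank tests its chunk; "gather" flattens the matches.
        matches = [n for chunk in chunks for n in chunk if is_match(cand, pool[n])]
        # Build the duplicate cluster and remove it from the pool.
        cluster = {cand_name: cand}
        for n in matches:
            cluster[n] = pool.pop(n)
        # Assumption: keep the lowest-energy member, else the candidate.
        if energy_key is None:
            best = cand_name
        else:
            best = min(cluster, key=lambda n: cluster[n][energy_key])
        unique[best] = cluster[best]
    return unique


# Toy structures: "fp" is a made-up scalar fingerprint used for matching.
pool = {
    "s1": {"fp": 1.00, "energy": -1.0},
    "s2": {"fp": 1.05, "energy": -2.0},
    "s3": {"fp": 5.00, "energy": -0.5},
    "s4": {"fp": 5.02, "energy": -0.1},
}


def close(a, b):
    return abs(a["fp"] - b["fp"]) < 0.1


unique = dedup_group_sketch(pool, close, energy_key="energy")
```

Because every pass removes at least the candidate from the pool, the loop always terminates, and each candidate is compared only against structures that have not yet been assigned to a cluster.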