Installation¶
This guide provides instructions for installing the bib-dedupe library on different platforms, including Windows, MacOS, and Linux. Follow the steps for your operating system.
Installing Python¶
Windows:
Check if Python v3.8+ is installed: - Open the search field of Windows and enter
CMD
. Click on Command Prompt. - Writepython --version
in the Command Prompt and press ENTER. - If this returns a number higher than 3.8, Python is installed, and you can proceed with installing bib-dedupe. If not, see the installation steps below.Install Python: - Go to https://python.org/downloads to download Python for Windows. - Install Python for Windows. IMPORTANT: Check the box “Add Python to your PATH environment variable” during the installation.
MacOS:
Check if Python v3.8+ is installed: - Open your Launchpad (in your Dock) and search for “Terminal”. Click on Terminal. - Write
python --version
in the Terminal and press ENTER. - If this returns a number higher than 3.8, Python is installed, and you can proceed with installing bib-dedupe. If not, see the installation steps below.Install Python: - Go to https://docs.conda.io/en/latest/miniconda.html#macosx-installers to download the latest version. - Install Miniconda for MacOS.
Linux:
Check if Python v3.8+ is installed: - Open a terminal window. - Write
python3 --version
orpython --version
in the terminal and press ENTER. - If this returns a number higher than 3.8, Python is installed, and you can proceed with installing bib-dedupe. If not, proceed to the next step.Install Python: - Use your distribution’s package manager to install Python 3.8 or newer. For example, on Ubuntu or Debian-based systems, you can use:
sudo apt update sudo apt install python3 python3-pip
This will install Python and pip, the Python package manager, which is required for the next steps.
Installing pip¶
pip is automatically installed with Python versions 3.4 and above when using the official Python installer for Windows and MacOS, and through package managers for Linux distributions. If you need to manually install or upgrade pip, you can do so by running:
python3 -m ensurepip --upgrade
Installing bib-dedupe¶
With Python v3.8+ and pip installed, you can now install bib-dedupe:
Open the Command Prompt (Windows) or Terminal (MacOS/Linux).
Install bib-dedupe using pip:
pip install bib-dedupe
Starting bib-dedupe¶
After installation, you can start using bib-dedupe in a Python script, Jupyter notebook, or another Python package. To use bib-dedupe in a Python script, you can create a new file (e.g., deduplication.py) with the following content:
import pandas as pd
from bib_dedupe.bib_dedupe import merge
# Load your bibliographic dataset into a pandas DataFrame
records_df = pd.read_csv("records.csv")
# Get the merged_df
merged_df = merge(records_df)
merged_df.to_csv("merged_records.csv", index=False)
Replace records.csv with the path to your bibliographic dataset. Run the script using the following command:
python deduplication.py