{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "feb369bc",
   "metadata": {},
   "source": [
    "# Write compliance results to a ``pandas.DataFrame`` object"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7e4fbc86",
   "metadata": {},
   "source": [
    "Granta MI BoM Analytics presents compliance results in a hierarchical data structure. Alternatively, you can\n",
    "represent the data in a tabular data structure, where each row contains a reference to the parent row.\n",
    "This example shows how compliance data can be translated from one format to another, making use\n",
    "of a ``pandas.DataFrame`` object to store the tabulated data."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4296e274",
   "metadata": {},
   "source": [
    "## Perform a compliance query"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9df109ea",
   "metadata": {},
   "source": [
    "The first step is to perform a compliance query on an assembly that results in a deeply\n",
    "nested structure. The following code is presented without explanation. For more information, see the\n",
    "[Perform a Part Compliance Query](../2_Compliance_Queries/2-3_Part_compliance.ipynb) example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a0df765c",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from ansys.grantami.bomanalytics import Connection, indicators, queries\n",
    "\n",
    "server_url = \"http://my_grantami_server/mi_servicelayer\"\n",
    "cxn = Connection(server_url).with_credentials(\"user_name\", \"password\").connect()\n",
    "svhc = indicators.WatchListIndicator(\n",
    "    name=\"SVHC\",\n",
    "    legislation_ids=[\"Candidate_AnnexXV\"],\n",
    "    default_threshold_percentage=0.1,\n",
    ")\n",
    "part_query = (\n",
    "    queries.PartComplianceQuery()\n",
    "    .with_record_history_ids([565060])\n",
    "    .with_indicators([svhc])\n",
    ")\n",
    "part_result = cxn.run(part_query)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac349139",
   "metadata": {},
   "source": [
    "The ``part_result`` object contains the compliance result for every subitem. This is ideal for understanding\n",
    "compliance at a certain *level* of the structure, For example, you can display the compliance for each item directly\n",
    "under the root part."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b547dc9c",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "for part in part_result.compliance_by_part_and_indicator[0].parts:\n",
    "    print(\n",
    "        f\"Part ID: {part.record_history_identity}, \"\n",
    "        f\"Compliance: {part.indicators['SVHC'].flag}\"\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9b5ec7f6",
   "metadata": {},
   "source": [
    "However, this structure makes it difficult to compare items at different levels. To do that, you want to flatten the\n",
    "data into a tabular structure."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "07d9b2f3",
   "metadata": {},
   "source": [
    "## Flatten the hierarchical data structure"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f0e346bc",
   "metadata": {},
   "source": [
    "You want to flatten the data into a ``list`` of ``dict`` objects, where each ``dict`` object represents an item in the\n",
    "hierarchy and each value in the ``dict`` object represents a property of this item. You can this use this structure\n",
    "can then directly or use it to construct a ``pandas.DataFrame`` object."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "72878641",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "source": [
    "First, define a helper function to transform a ``ComplianceQueryResult`` object into a ``dict`` object. In addition to\n",
    "storing properties that are intrinsic to the item (such as the ID, type, and SVHC result), you want to store\n",
    "structural information, such as the level of the item and the ID of its parent."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "06d33d97",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "def create_dict(item, item_type, level, parent_id):\n",
    "    \"\"\"Add a BoM item to a list\"\"\"\n",
    "    item_id = item.record_history_identity\n",
    "    indicator = item.indicators[\"SVHC\"]\n",
    "    row = {\n",
    "        \"Item\": item_id,\n",
    "        \"Parent\": parent_id,\n",
    "        \"Type\": item_type,\n",
    "        \"SVHC\": indicator,\n",
    "        \"Level\": level,\n",
    "    }\n",
    "    return row"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2ea0fcc0",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "source": [
    "To help with the flattening process, you also define a schema, which describes which child item types each item\n",
    "type can contain."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f73f5615",
   "metadata": {
    "lines_to_next_cell": 2,
    "tags": []
   },
   "outputs": [],
   "source": [
    "schema = {\n",
    "    \"Part\": [\"Part\", \"Specification\", \"Material\", \"Substance\"],\n",
    "    \"Specification\": [\"Specification\", \"Coating\", \"Material\", \"Substance\"],\n",
    "    \"Material\": [\"Substance\"],\n",
    "    \"Coating\": [\"Substance\"],\n",
    "    \"Substance\": [],\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "95d0176f",
   "metadata": {},
   "source": [
    "The function itself performs the flattening via a stack-based approach, where the children of the item currently\n",
    "being processed are iteratively added to the ``items_to_process`` stack. Because this stack is being both modified and\n",
    "iterated over, you must use a ``while`` loop and ``.pop()`` statement instead of a ``for`` loop."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12e1f027",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "source": [
    "The stack uses a special type of collection called a ``deque``, which is similar to a ``list`` but is optimized for\n",
    "these sorts of stack-type use cases involving repeated calls to ``.pop()`` and ``.extend()`` statements."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "08a7062b",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from collections import deque\n",
    "\n",
    "\n",
    "def flatten_bom(root_part):\n",
    "    result = []  # List to contain all dicts\n",
    "\n",
    "    # The stack contains a deque of tuples: (item_object, item_type, level, parent_id)\n",
    "    # First seed the stack with the root part\n",
    "    items_to_process = deque([(root_part, \"Part\", 0, None)])\n",
    "\n",
    "    while items_to_process:\n",
    "        # Get the next item from the stack\n",
    "        item_object, item_type, level, parent = items_to_process.pop()\n",
    "        # Create the dict\n",
    "        row = create_dict(item_object, item_type, level, parent)\n",
    "        # Append it to the result list\n",
    "        result.append(row)\n",
    "\n",
    "        # Compute the properties for the child items\n",
    "        item_id = item_object.record_history_identity\n",
    "        child_items = schema[item_type]\n",
    "        child_level = level + 1\n",
    "\n",
    "        # Add the child items to the stack\n",
    "        if \"Part\" in child_items:\n",
    "            items_to_process.extend([(p, \"Part\", child_level, item_id)\n",
    "                                     for p in item_object.parts])\n",
    "        if \"Specification\" in child_items:\n",
    "            items_to_process.extend([(s, \"Specification\", child_level, item_id)\n",
    "                                     for s in item_object.specifications])\n",
    "        if \"Material\" in child_items:\n",
    "            items_to_process.extend([(m, \"Material\", child_level, item_id)\n",
    "                                     for m in item_object.materials])\n",
    "        if \"Coating\" in child_items:\n",
    "            items_to_process.extend([(c, \"Coating\", child_level, item_id)\n",
    "                                     for c in item_object.coatings])\n",
    "        if \"Substance\" in child_items:\n",
    "            items_to_process.extend([(s, \"Substance\", child_level, item_id)\n",
    "                                     for s in item_object.substances])\n",
    "\n",
    "    # When the stack is empty, the while loop exists. Return the result list.\n",
    "    return result"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e1d5312",
   "metadata": {},
   "source": [
    "Finally, call the preceding function against the results from the compliance query and use the list to create a\n",
    "``pandas.DataFrame`` object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "24d00184",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "data = flatten_bom(part_result.compliance_by_part_and_indicator[0])\n",
    "df_full = pd.DataFrame(data)\n",
    "print(f\"{len(df_full)} rows\")\n",
    "df_full.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3f2075e4",
   "metadata": {},
   "source": [
    "## Postprocess the ``pandas.DataFrame`` object"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0a87206f",
   "metadata": {},
   "source": [
    "Now that you have the data in a ``pandas.DataFrame`` object, you can perform operations across all levels of the\n",
    "structure more easily. For example, you can delete all rows that are less than the 'Above Threshold' state, retaining\n",
    "only rows that are non-compliant. (Note that this reduces the number of rows significantly.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c889c918",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "threshold = indicators.WatchListFlag.WatchListAboveThreshold\n",
    "df_non_compliant = df_full.drop(df_full[df_full.SVHC < threshold].index)\n",
    "print(f\"{len(df_non_compliant)} rows\")\n",
    "df_non_compliant.head()"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "formats": "ipynb,py:light"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}