Analyzing US Lobbying Data in Neo4J
Let's analyze US lobbying data in Neo4j.
Join the DZone community and get the full member experience.
Join For FreeOverview
In a previous post, I loaded a subset of federal lobbying data into Neo4J; almost 19,000 filings from 2019 Q1.
The database schema created by my project includes 6 distinct nodes and 7 distinct relationships.
After loading the Neo4J database, Cypher queries are used to interrogate the data and attempt to learn something. The bulk of this article shows different Cypher statements and reviews the results.
The results shown represent the subset I've loaded; the more data is loaded, the more relevant the results are. Nevertheless, the potential of what can be learned is started, with your challenge to continue to find other ways to query the data.
You might also like: Querying Graphs with Neo4j
Definitions
- Filing: Represents a single lobbying effort, specifying the detail information about the filing — i.e., unique identifier, period represent, dollar amount spent, date of filing, detailed description — and has relationships to additional details.
- Client: Special interest groups — e.g., corporations, non-profits, industries, national and international governments — advocating for/against legislation or regulations under consideration by the federal government.
- Lobbyist: A professional hired by the client to present the client's position and persuade the federal government to take the client's position with regards to proposed legislation and regulations.
- Registrant: The organization performing lobbying activities on behalf of the client, registered with the US government. Clients may lobby on their own behalf as both client and registrant or may hire firms who specialize in lobbying and hire lobbyists.
- Government Entity: A department, regulatory agency, commission, or branch of government lobbied. Multiple entities are usually associated with a single filing; by far the most lobbied entities are the legislative branches, the Senate and House of Representatives.
- Issue: Filings are assigned to general categories to simplify reporting, such as Education, Transportation, and Natural Resources. The specifics of the lobbying effort is described with each filing.
Government Entities
Understanding who's being lobbied provides one data point in understanding where the power lies in the federal government; the importance of a government entity often corresponds to the lobbying of that government entity.
Filings-to-Government Entity Distribution
Each filing targets at least one and often more government entities, including some filings that target a very broad set of government entities. In total, the current data set contains 290 government entities, but only 2% of the filings target 10 or more government agencies.
MATCH (f:Filing)-[:TARGETED_AT]-(g:GovernmentEntity)
WITH f.filingId AS id, COUNT(g) AS cnt
RETURN cnt, count(id)
ORDER BY cnt
+-----------------+
| cnt | count(id) |
+-----------------+
| 1 | 936 |
| 2 | 6881 |
| 3 | 3080 |
| 4 | 1696 |
| 5 | 854 |
| 6 | 466 |
| 7 | 291 |
| 8 | 172 |
| 9 | 140 |
| 10 | 88 |
| 11 | 75 |
| 12 | 43 |
| 13 | 37 |
| 14 | 37 |
| 15 | 26 |
| 16 | 24 |
| 17 | 11 |
| 18 | 11 |
| 19 | 8 |
| 20 | 7 |
| 21 | 4 |
| 22 | 8 |
| 23 | 5 |
| 24 | 2 |
| 25 | 5 |
| 26 | 4 |
| 27 | 2 |
| 28 | 1 |
| 29 | 1 |
| 30 | 1 |
| 36 | 1 |
| 37 | 1 |
| 40 | 1 |
| 46 | 1 |
+-----------------+
34 rows available after 82 ms, consumed after another 0 ms
neo4j>
Targeted Government Entities
Not surprisingly, the legislative branches — the House of Representatives and the Senate, where laws originate — are targetted ten times more than any other government entity, receiving over $1B in filings of the current data set.
MATCH (f:Filing)-[:TARGETED_AT]->(g:GovernmentEntity)
RETURN g.name, COUNT(f), SUM(f.amount)
ORDER BY COUNT(f) DESC
LIMIT 25
+-------------------------------------------------------------------------------+
| g.name | COUNT(f) | SUM(f.amount) |
+-------------------------------------------------------------------------------+
| "HOUSE OF REPRESENTATIVES" | 14070 | 1071281019 |
| "SENATE" | 13832 | 1068025331 |
| "Health & Human Services, Dept of (HHS)" | 1050 | 221133190 |
| "Transportation, Dept of (DOT)" | 864 | 130519342 |
| "White House Office" | 857 | 240088068 |
| "Treasury, Dept of" | 828 | 274988850 |
| "Environmental Protection Agency (EPA)" | 820 | 163365449 |
| "Agriculture, Dept of (USDA)" | 802 | 132632023 |
| "Centers For Medicare and Medicaid Services (CMS)" | 801 | 137015012 |
| "Commerce, Dept of (DOC)" | 756 | 221285052 |
| "Defense, Dept of (DOD)" | 680 | 121776600 |
| "Interior, Dept of (DOI)" | 648 | 83118013 |
| "U.S. Trade Representative (USTR)" | 587 | 215733186 |
| "Office of Management & Budget (OMB)" | 582 | 172992161 |
| "Executive Office of the President (EOP)" | 578 | 173938319 |
| "Energy, Dept of" | 532 | 121617783 |
| "Homeland Security, Dept of (DHS)" | 516 | 155406355 |
| "State, Dept of (DOS)" | 507 | 153597551 |
| "Army, Dept of (Corps of Engineers)" | 396 | 39805101 |
| "Food & Drug Administration (FDA)" | 389 | 80056350 |
| "Labor, Dept of (DOL)" | 375 | 112349309 |
| "Education, Dept of" | 357 | 56165422 |
| "Justice, Dept of (DOJ)" | 271 | 86116472 |
| "Federal Communications Commission (FCC)" | 244 | 90486758 |
| "Veterans Affairs, Dept of (VA)" | 241 | 52929414 |
+-------------------------------------------------------------------------------+
25 rows available after 72 ms, consumed after another 0 ms
neo4j>
Other government entities are more rarely lobbied, though to be statistically relevant requires a larger data set. Both the number of filings and the dollar amount used lobbying are significantly less than the most-lobbied government entities.
MATCH (f:Filing)-[:TARGETED_AT]->(g:GovernmentEntity)
WHERE f.amount > 0
RETURN g.name, COUNT(f), SUM(f.amount)
ORDER BY SUM(f.amount)
LIMIT 25;
+---------------------------------------------------------------------------------+
| g.name | COUNT(f) | SUM(f.amount) |
+---------------------------------------------------------------------------------+
| "Office of Faith-Based & Community Initiatives" | 1 | 10000 |
| "Defense Threat Reduction Agency (DTRA)" | 1 | 10000 |
| "U.S. Botanic Garden" | 1 | 10000 |
| "Architectural & Transportation Barriers Compliance" | 1 | 20000 |
| "Agency for Toxic Substances & Disease Registry" | 1 | 20000 |
| "Natl Institute of Justice" | 1 | 20000 |
| "Program Support Center" | 1 | 35000 |
| "U.S. Commission on International Religious Freedom" | 2 | 40000 |
| "Voice of America" | 1 | 40000 |
| "U.S. Chemical Safety & Hazard Investigation Board" | 1 | 40000 |
| "U.S. Secret Service" | 2 | 40000 |
| "Natl Foundation on the Arts & Humanities" | 2 | 40000 |
| "St. Lawrence Seaway Development Corporation" | 1 | 40000 |
| "Defense Nuclear Facilities Safety Board" | 1 | 40000 |
| "Office of Government Ethics (OGE)" | 1 | 50000 |
| "U.S. Mint" | 2 | 60000 |
| "USA Freedom Corps" | 1 | 60000 |
| "Bureau of Engraving & Printing" | 2 | 60000 |
| "Appalachian Regional Commission" | 5 | 70000 |
| "Natl Capital Planning Commission (NCPC)" | 1 | 73680 |
| "Tennessee Valley Authority (TVA)" | 2 | 75000 |
| "Natl Archives & Records Administration (NARA)" | 2 | 85000 |
| "Bureau of Prisons (BOP)" | 2 | 90000 |
| "Office of Justice Program" | 4 | 90000 |
| "Natl Council on Disability" | 3 | 102522 |
+---------------------------------------------------------------------------------+
25 rows available after 77 ms, consumed after another 0 ms
neo4j>
Grouped Government Entities
Knowing that a filing targets one or more government entities, is the set of government entities targeted by each filing interesting? Does the grouping of government entities provide any insight into how lobbying is organized?
The Cypher function COLLECT is a way to flatten data, turning individual values into an array. Here, we'll collect the government entity names and use them to find filings targeting each group.
MATCH (f:Filing)-[:TARGETED_AT]-(g:GovernmentEntity)
WHERE f.amount > 0
WITH f, g ORDER BY g.name
WITH f, COLLECT(DISTINCT g.name) AS gnames
RETURN count(f), gnames
ORDER BY count(f) DESC
LIMIT 30
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| count(f) | gnames |
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| 5806 | ["HOUSE OF REPRESENTATIVES", "SENATE"] |
| 290 | ["HOUSE OF REPRESENTATIVES"] |
| 187 | ["HOUSE OF REPRESENTATIVES", "Health & Human Services, Dept of (HHS)", "SENATE"] |
| 174 | ["Agriculture, Dept of (USDA)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 168 | ["HOUSE OF REPRESENTATIVES", "SENATE", "Treasury, Dept of"] |
| 159 | ["Centers For Medicare and Medicaid Services (CMS)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 158 | ["HOUSE OF REPRESENTATIVES", "SENATE", "Transportation, Dept of (DOT)"] |
| 158 | ["Defense, Dept of (DOD)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 156 | ["Centers For Medicare and Medicaid Services (CMS)", "HOUSE OF REPRESENTATIVES", "Health & Human Services, Dept of (HHS)", "SENATE"] |
| 155 | ["SENATE"] |
| 127 | ["HOUSE OF REPRESENTATIVES", "Interior, Dept of (DOI)", "SENATE"] |
| 97 | ["Environmental Protection Agency (EPA)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 83 | ["Education, Dept of", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 82 | ["HOUSE OF REPRESENTATIVES", "SENATE", "White House Office"] |
| 80 | ["Executive Office of the President (EOP)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 61 | ["Federal Communications Commission (FCC)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 61 | ["Commerce, Dept of (DOC)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 59 | ["Energy, Dept of", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 56 | ["Army, Dept of (Corps of Engineers)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 53 | ["HOUSE OF REPRESENTATIVES", "SENATE", "Securities & Exchange Commission (SEC)"] |
| 50 | ["HOUSE OF REPRESENTATIVES", "SENATE", "U.S. Trade Representative (USTR)"] |
| 47 | ["Food & Drug Administration (FDA)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 46 | ["HOUSE OF REPRESENTATIVES", "Homeland Security, Dept of (DHS)", "SENATE"] |
| 43 | ["HOUSE OF REPRESENTATIVES", "SENATE", "Veterans Affairs, Dept of (VA)"] |
| 40 | ["HOUSE OF REPRESENTATIVES", "SENATE", "State, Dept of (DOS)"] |
| 32 | ["Federal Aviation Administration (FAA)", "HOUSE OF REPRESENTATIVES", "SENATE"] |
| 31 | ["HOUSE OF REPRESENTATIVES", "Labor, Dept of (DOL)", "SENATE"] |
| 25 | ["Environmental Protection Agency (EPA)"] |
| 24 | ["HOUSE OF REPRESENTATIVES", "Internal Revenue Service (IRS)", "SENATE", "Treasury, Dept of"] |
| 24 | ["Agriculture, Dept of (USDA)", "HOUSE OF REPRESENTATIVES", "SENATE", "U.S. Trade Representative (USTR)"] |
+-------------------------------------------------------------------------------------------------------------------------------------------------+
30 rows available after 150 ms, consumed after another 0 ms
neo4j>
As previously discussed, the House of Representatives and Senate are disproportionately lobbied when compared to other government entities, obscuring other data that is overwhelmed by the sheer quantity. It's easier to assume that most filings already target the House and Senate and instead filter those government entities from any results, allowing other data to become visible.
MATCH (f:Filing)-[]-(g:GovernmentEntity)
WHERE f.amount > 0 AND g.name <> 'SENATE' AND g.name <> 'HOUSE OF REPRESENTATIVES'
WITH f, g ORDER BY g.name
WITH f, COLLECT(DISTINCT g.name) AS gnames
RETURN count(f), gnames
ORDER BY count(f) DESC
LIMIT 30
+-----------------------------------------------------------------------------------------------------------+
| count(f) | gnames |
+-----------------------------------------------------------------------------------------------------------+
| 198 | ["Health & Human Services, Dept of (HHS)"] |
| 188 | ["Agriculture, Dept of (USDA)"] |
| 187 | ["Treasury, Dept of"] |
| 176 | ["Centers For Medicare and Medicaid Services (CMS)"] |
| 175 | ["Transportation, Dept of (DOT)"] |
| 169 | ["Defense, Dept of (DOD)"] |
| 161 | ["Centers For Medicare and Medicaid Services (CMS)", "Health & Human Services, Dept of (HHS)"] |
| 141 | ["Interior, Dept of (DOI)"] |
| 127 | ["Environmental Protection Agency (EPA)"] |
| 100 | ["White House Office"] |
| 98 | ["Executive Office of the President (EOP)"] |
| 89 | ["Education, Dept of"] |
| 72 | ["Energy, Dept of"] |
| 67 | ["Commerce, Dept of (DOC)"] |
| 64 | ["Federal Communications Commission (FCC)"] |
| 63 | ["Army, Dept of (Corps of Engineers)"] |
| 58 | ["State, Dept of (DOS)"] |
| 57 | ["U.S. Trade Representative (USTR)"] |
| 57 | ["Securities & Exchange Commission (SEC)"] |
| 50 | ["Food & Drug Administration (FDA)"] |
| 49 | ["Homeland Security, Dept of (DHS)"] |
| 47 | ["Veterans Affairs, Dept of (VA)"] |
| 35 | ["Federal Aviation Administration (FAA)"] |
| 32 | ["Labor, Dept of (DOL)"] |
| 26 | ["Internal Revenue Service (IRS)", "Treasury, Dept of"] |
| 25 | ["Office of Management & Budget (OMB)"] |
| 24 | ["Agriculture, Dept of (USDA)", "U.S. Trade Representative (USTR)"] |
| 23 | ["Justice, Dept of (DOJ)"] |
| 22 | ["Army, Dept of (Corps of Engineers)", "Office of Management & Budget (OMB)"] |
| 22 | ["Housing & Urban Development, Dept of (HUD)"] |
+-----------------------------------------------------------------------------------------------------------+
30 rows available after 74 ms, consumed after another 0 ms
neo4j>
Immediately, we can see differences in the results: the filings for the Department of Health and Human Services increase from 187 to 198; for the Department of the Treasury, it's 168 to 187. By removing the House and Senate, we have more filings that can be grouped together.
Going forward, the Senate and House government entities are always filtered out to provide (hopefully) more relevant data to look at.
Client
Now that we understand who's being lobbied — the government entities — the next question is, what organizations are responsible for the actual lobbying? These are the clients.
Most Targeted By Client
Instead of just see what government entities are being lobbied, we'll query Neo4J to which government entities each client is lobbying.
MATCH (c:Client)<-[:ON_BEHALF_OF]-(f:Filing)-[:TARGETED_AT]->(g:GovernmentEntity)
WHERE f.amount > 0 AND g.name <> 'SENATE' AND g.name <> 'HOUSE OF REPRESENTATIVES'
RETURN COUNT(f) AS cnt, SUM(f.amount) AS amount, c.name, g.name
ORDER BY c.name, g.name, COUNT(f)
LIMIT 30
+--------------------------------------------------------------------------------------------------------------------------+
| cnt | amount | c.name | g.name |
+--------------------------------------------------------------------------------------------------------------------------+
| 1 | 30000 | "(\"BIO\") BIOTECHNOLOGY INDUSTRY ORGANIZATION" | "Securities & Exchange Commission (SEC)" |
| 1 | 10000 | "1-800 Contacts" | "Federal Aviation Administration (FAA)" |
| 1 | 50000 | "1-800 Contacts" | "Homeland Security, Dept of (DHS)" |
| 1 | 50000 | "1-800 Contacts" | "Justice, Dept of (DOJ)" |
| 1 | 50000 | "?BAYER CORPORATION FORMERLY KNOWN AS BAYER U S LLC" | "Commerce, Dept of (DOC)" |
| 1 | 50000 | "?BAYER CORPORATION FORMERLY KNOWN AS BAYER U S LLC" | "Defense, Dept of (DOD)" |
| 1 | 50000 | "?BAYER CORPORATION FORMERLY KNOWN AS BAYER U S LLC" | "Health & Human Services, Dept of (HHS)" |
| 1 | 50000 | "?BAYER CORPORATION FORMERLY KNOWN AS BAYER U S LLC" | "Interior, Dept of (DOI)" |
| 1 | 50000 | "?BAYER CORPORATION FORMERLY KNOWN AS BAYER U S LLC" | "Labor, Dept of (DOL)" |
| 1 | 50000 | "?BAYER CORPORATION FORMERLY KNOWN AS BAYER U S LLC" | "Natl Institutes of Health (NIH)" |
| 1 | 150000 | "?BAYER CORPORATION FORMERLY KNOWN AS BAYER U S LLC" | "Patent & Trademark Office (PTO)" |
| 1 | 50000 | "?BAYER CORPORATION FORMERLY KNOWN AS BAYER U S LLC" | "White House Office" |
| 1 | 50000 | "A1A Senior Ventures" | "Homeland Security, Dept of (DHS)" |
| 1 | 50000 | "A1A Senior Ventures" | "Interior, Dept of (DOI)" |
| 1 | 50000 | "A1A Senior Ventures" | "Transportation, Dept of (DOT)" |
| 1 | 50000 | "A1A Senior Ventures" | "U.S. Forest Service" |
| 1 | 30000 | "AAPAN" | "Centers For Medicare and Medicaid Services (CMS)" |
| 1 | 30000 | "AAPAN" | "Health & Human Services, Dept of (HHS)" |
| 1 | 80000 | "ABBOTT LABORATORIES" | "Defense, Dept of (DOD)" |
| 1 | 150000 | "ABBVIE" | "Defense, Dept of (DOD)" |
| 1 | 150000 | "ABBVIE" | "Energy, Dept of" |
| 1 | 50000 | "ABBVIE" | "Executive Office of the President (EOP)" |
| 1 | 50000 | "ABBVIE" | "Health & Human Services, Dept of (HHS)" |
| 1 | 150000 | "ABBVIE" | "Natl Security Council (NSC)" |
| 1 | 150000 | "ABBVIE" | "State, Dept of (DOS)" |
| 1 | 150000 | "ABBVIE" | "Vice President of the U.S." |
| 2 | 90000 | "ACCSES" | "Centers For Medicare and Medicaid Services (CMS)" |
| 1 | 20000 | "ACCSES" | "Education, Dept of" |
| 1 | 30000 | "ACCSES" | "Energy, Dept of" |
| 1 | 20000 | "ACCSES" | "Executive Office of the President (EOP)" |
+--------------------------------------------------------------------------------------------------------------------------+
30 rows available after 139 ms, consumed after another 0 ms
neo4j>
I'm limiting the results to 30 rows as in total there are almost 6800 client-entity combinations in the current data set. Ordering the data by client name is not as interesting as ordering by quantity or total amount of filings.
MATCH (c:Client)<-[:ON_BEHALF_OF]-(f:Filing)-[:TARGETED_AT]->(g:GovernmentEntity)
WHERE f.amount > 0 AND g.name <> 'SENATE' AND g.name <> 'HOUSE OF REPRESENTATIVES'
RETURN COUNT(f) AS cnt, SUM(f.amount) AS amount, c.name, g.name
ORDER BY COUNT(f) DESC, SUM(f.amount) DESC, c.name, g.name
LIMIT 30
+-----------------------------------------------------------------------------------------+
| cnt | amount | c.name | g.name |
+-----------------------------------------------------------------------------------------+
| 389 | 196343337 | "Sargento Foods" | "Health & Human Services, Dept of (HHS)" |
| 384 | 248793275 | "Sargento Foods" | "Treasury, Dept of" |
| 349 | 202729861 | "Sargento Foods" | "Commerce, Dept of (DOC)" |
| 342 | 206329495 | "Sargento Foods" | "U.S. Trade Representative (USTR)" |
| 341 | 212655325 | "Sargento Foods" | "White House Office" |
| 318 | 148180897 | "Sargento Foods" | "Environmental Protection Agency (EPA)" |
| 286 | 120455645 | "Sargento Foods" | "Centers For Medicare and Medicaid Services (CMS)" |
| 284 | 119196823 | "Sargento Foods" | "Agriculture, Dept of (USDA)" |
| 253 | 161183186 | "Sargento Foods" | "Office of Management & Budget (OMB)" |
| 249 | 112260093 | "Sargento Foods" | "Transportation, Dept of (DOT)" |
| 216 | 139569860 | "Sargento Foods" | "State, Dept of (DOS)" |
| 214 | 146681815 | "Sargento Foods" | "Homeland Security, Dept of (DHS)" |
| 206 | 106726509 | "Sargento Foods" | "Labor, Dept of (DOL)" |
| 203 | 109231592 | "Sargento Foods" | "Energy, Dept of" |
| 200 | 105022774 | "Sargento Foods" | "Defense, Dept of (DOD)" |
| 199 | 156337753 | "Sargento Foods" | "Executive Office of the President (EOP)" |
| 177 | 72937656 | "Sargento Foods" | "Food & Drug Administration (FDA)" |
| 159 | 68988563 | "Sargento Foods" | "Interior, Dept of (DOI)" |
| 146 | 51758254 | "Sargento Foods" | "Education, Dept of" |
| 129 | 89923961 | "Sargento Foods" | "Internal Revenue Service (IRS)" |
| 125 | 109895621 | "Sargento Foods" | "Natl Economic Council (NEC)" |
| 119 | 86354009 | "Sargento Foods" | "Federal Communications Commission (FCC)" |
| 109 | 90478573 | "Sargento Foods" | "Securities & Exchange Commission (SEC)" |
| 107 | 78043781 | "Sargento Foods" | "Justice, Dept of (DOJ)" |
| 91 | 60551493 | "Sargento Foods" | "Federal Aviation Administration (FAA)" |
| 90 | 48474831 | "Sargento Foods" | "Veterans Affairs, Dept of (VA)" |
| 81 | 31471344 | "Sargento Foods" | "Housing & Urban Development, Dept of (HUD)" |
| 77 | 62542091 | "Sargento Foods" | "Federal Trade Commission (FTC)" |
| 76 | 32586516 | "Sargento Foods" | "Army, Dept of (Corps of Engineers)" |
| 75 | 67313587 | "Sargento Foods" | "Federal Reserve System" |
+-----------------------------------------------------------------------------------------+
30 rows available after 128 ms, consumed after another 0 ms
neo4j>
Definitely more interesting — and more insightful — on a per-government entity basis, Sargento Foods lobbies more than any other client. This is not surprising; in the current data set, Sargento has six times the number of filings than the next highest client. However, results by an individual government entity is misleading, as each filing can target multiple government entities.
Grouped Government Entities By Client
Instead of considering government entities individually per filing and client, let's group the government entities targeted by a filing, also breaking out by client.
MATCH (c:Client)<-[]-(f:Filing)-[]-(g:GovernmentEntity)
WHERE f.amount > 0 AND g.name <> 'SENATE' AND g.name <> 'HOUSE OF REPRESENTATIVES'
WITH c, f, g ORDER BY g.name
WITH c, f, COLLECT(DISTINCT g.name) AS gnames
RETURN COUNT(f), SUM(f.amount), c.name, gnames, count(f)
ORDER BY COUNT(f) DESC
LIMIT 30
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| COUNT(f) | SUM(f.amount) | c.name | gnames | count(f) |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 42 | 7193662 | "Sargento Foods" | ["Centers For Medicare and Medicaid Services (CMS)"] | 42 |
| 42 | 7284778 | "Sargento Foods" | ["Centers For Medicare and Medicaid Services (CMS)", "Health & Human Services, Dept of (HHS)"] | 42 |
| 33 | 2419344 | "Sargento Foods" | ["Agriculture, Dept of (USDA)"] | 33 |
| 29 | 5379655 | "Sargento Foods" | ["Health & Human Services, Dept of (HHS)"] | 29 |
| 28 | 1890876 | "Sargento Foods" | ["Education, Dept of"] | 28 |
| 22 | 9998200 | "Sargento Foods" | ["Treasury, Dept of"] | 22 |
| 21 | 2947811 | "Sargento Foods" | ["U.S. Trade Representative (USTR)"] | 21 |
| 20 | 3849000 | "Sargento Foods" | ["White House Office"] | 20 |
| 18 | 1457608 | "Sargento Foods" | ["Environmental Protection Agency (EPA)"] | 18 |
| 15 | 1100914 | "Sargento Foods" | ["Interior, Dept of (DOI)"] | 15 |
| 14 | 884884 | "Sargento Foods" | ["Transportation, Dept of (DOT)"] | 14 |
| 14 | 680000 | "BAYER CORPORATION" | ["Interior, Dept of (DOI)"] | 14 |
| 13 | 5094734 | "Sargento Foods" | ["Federal Communications Commission (FCC)"] | 13 |
| 12 | 1522431 | "Sargento Foods" | ["Centers For Medicare and Medicaid Services (CMS)", "Food & Drug Administration (FDA)", "Health & Human Services, Dept of (HHS)"] | 12 |
| 12 | 2592425 | "Sargento Foods" | ["Securities & Exchange Commission (SEC)"] | 12 |
| 11 | 200000 | "RAI Services Company" | ["Transportation, Dept of (DOT)"] | 11 |
| 11 | 258000 | "ST CHARLES COUNTY" | ["Transportation, Dept of (DOT)"] | 11 |
| 11 | 1024000 | "Sargento Foods" | ["Homeland Security, Dept of (DHS)"] | 11 |
| 10 | 2425092 | "Sargento Foods" | ["Agriculture, Dept of (USDA)", "U.S. Trade Representative (USTR)"] | 10 |
| 10 | 2203500 | "Sargento Foods" | ["Defense, Dept of (DOD)"] | 10 |
| 10 | 1615844 | "Sargento Foods" | ["Commerce, Dept of (DOC)"] | 10 |
| 10 | 236000 | "Palantir Technologies" | ["Agriculture, Dept of (USDA)"] | 10 |
| 10 | 994559 | "Sargento Foods" | ["Executive Office of the President (EOP)"] | 10 |
| 9 | 710000 | "Sargento Foods" | ["Federal Housing Finance Agency (FHFA)", "Housing & Urban Development, Dept of (HUD)", "Treasury, Dept of"] | 9 |
| 9 | 2825000 | "Sargento Foods" | ["State, Dept of (DOS)"] | 9 |
| 9 | 216600 | "Verde Technologies" | ["Agriculture, Dept of (USDA)"] | 9 |
| 9 | 245346 | "Verde Technologies" | ["Centers For Medicare and Medicaid Services (CMS)"] | 9 |
| 8 | 3415000 | "Sargento Foods" | ["Energy, Dept of"] | 8 |
| 8 | 520000 | "ST CHARLES COUNTY" | ["Health & Human Services, Dept of (HHS)"] | 8 |
| 8 | 315000 | "Sargento Foods" | ["Agriculture, Dept of (USDA)", "Environmental Protection Agency (EPA)"] | 8 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
30 rows available after 3185 ms, consumed after another 0 ms
neo4j>
Sargento dominates again, but now see that many filings appear to target a small number of government entities. The results also show clients other than Sargento, since there's no duplicate counting of filings.
Sargento Foods
Based on the previous analysis and the available filings, with the most potential for interesting discovery is Sargento Foods. The remainder of this article focuses on this single client, analyzing the lobbying done, and therefore the filings made, on Sargento's behalf.
Your analysis might result in different results, based on the data loaded.
Distribution
Earlier, I suggested that Sargento lobbied few government entities in each filing based on previous queries but without complete information. By adapting the initial distribution query, I can see the distribution for this client.
MATCH (c:Client {name:'Sargento Foods'})<-[:ON_BEHALF_OF]-(f:Filing)-[:TARGETED_AT]-(g:GovernmentEntity)
WITH f.filingId AS id, COUNT(g) AS cnt
RETURN cnt, count(id)
ORDER BY cnt
+-----------------+
| cnt | count(id) |
+-----------------+
| 1 | 120 |
| 2 | 1017 |
| 3 | 479 |
| 4 | 390 |
| 5 | 281 |
| 6 | 184 |
| 7 | 150 |
| 8 | 111 |
| 9 | 90 |
| 10 | 61 |
| 11 | 52 |
| 12 | 35 |
| 13 | 29 |
| 14 | 34 |
| 15 | 23 |
| 16 | 20 |
| 17 | 9 |
| 18 | 7 |
| 19 | 7 |
| 20 | 7 |
| 21 | 4 |
| 22 | 8 |
| 23 | 3 |
| 24 | 2 |
| 25 | 4 |
| 26 | 4 |
| 27 | 2 |
| 28 | 1 |
| 29 | 1 |
| 30 | 1 |
| 36 | 1 |
| 37 | 1 |
| 40 | 1 |
| 46 | 1 |
+-----------------+
34 rows available after 46 ms, consumed after another 0 ms
neo4j>
We learn from these results that the majority of Sargento filings target two government entities, but do show a decent number of filings targeting double-digit government entities; in one extreme, 46 targeted government entities.
Issues
Similar to government entities, each filing is about one or more issues (codes) that allow grouping filings by general or high-level topics. Sargento filings are about 79 distinct issues.
MATCH (c:Client {name:'Sargento Foods'})<-[ON_BEHALF_OF]-(f:Filing)-[:ABOUT]->(i:Issue)
WHERE f.amount > 0
RETURN i.code, COUNT(f) AS cnt, SUM(f.amount) AS amount
ORDER BY cnt DESC, i.code
LIMIT 30
+-------------------------------------------------------------------+
| i.code | cnt | amount |
+-------------------------------------------------------------------+
| "TAXATION/INTERNAL REVENUE CODE" | 942 | 443144120 |
| "BUDGET/APPROPRIATIONS" | 903 | 285352318 |
| "HEALTH ISSUES" | 700 | 275873762 |
| "TRADE (DOMESTIC/FOREIGN)" | 599 | 318872038 |
| "TRANSPORTATION" | 462 | 193790118 |
| "MEDICARE/MEDICAID" | 385 | 157444480 |
| "ENERGY/NUCLEAR" | 381 | 163411615 |
| "EDUCATION" | 348 | 106016869 |
| "DEFENSE" | 330 | 140394155 |
| "ENVIRONMENT/SUPERFUND" | 327 | 147936930 |
| "LABOR ISSUES/ANTITRUST/WORKPLACE" | 306 | 149661469 |
| "AGRICULTURE" | 289 | 89572321 |
| "IMMIGRATION" | 271 | 110495186 |
| "FINANCIAL INSTITUTIONS/INVESTMENTS/SECURITIES" | 254 | 150232138 |
| "GOVERNMENT ISSUES" | 233 | 108789011 |
| "HOMELAND SECURITY" | 233 | 163490262 |
| "COPYRIGHT/PATENT/TRADEMARK" | 173 | 158536910 |
| "NATURAL RESOURCES" | 172 | 65210210 |
| "CONSUMER ISSUES/SAFETY/PRODUCTS" | 171 | 126362248 |
| "BANKING" | 167 | 69001243 |
| "TELECOMMUNICATIONS" | 167 | 119834729 |
| "SCIENCE/TECHNOLOGY" | 162 | 73518980 |
| "CLEAN AIR AND WATER (QUALITY)" | 159 | 76302221 |
| "INSURANCE" | 142 | 85804650 |
| "FOREIGN RELATIONS" | 141 | 79950395 |
| "RETIREMENT" | 131 | 81021876 |
| "VETERANS" | 125 | 35415343 |
| "AVIATION/AIRLINES/AIRPORTS" | 113 | 67184550 |
| "LAW ENFORCEMENT/CRIME/CRIMINAL JUSTICE" | 111 | 52840872 |
| "FOOD INDUSTRY (SAFETY, LABELING, ETC.)" | 109 | 34169940 |
+-------------------------------------------------------------------+
30 rows available after 40 ms, consumed after another 0 ms
neo4j>
Again, similar to government entities, analyzing issues individually is less interesting than looking at which issues filings are grouped to.
MATCH (c:Client {name:'Sargento Foods'})<-[ON_BEHALF_OF]-(f:Filing)-[ABOUT]->(i:Issue)
WHERE f.amount > 0
WITH f, i ORDER BY i.code
WITH f, COLLECT(DISTINCT i.code) AS iCodes
RETURN COUNT(f) AS cnt, SUM(f.amount) AS amount, iCodes
ORDER BY cnt DESC, iCodes
LIMIT 30
+-----------------------------------------------------------------------------------------------------------------+
| cnt | amount | iCodes |
+-----------------------------------------------------------------------------------------------------------------+
| 100 | 10444604 | ["HEALTH ISSUES"] |
| 60 | 10527280 | ["HEALTH ISSUES", "MEDICARE/MEDICAID"] |
| 48 | 6731321 | ["MEDICARE/MEDICAID"] |
| 42 | 2114054 | ["EDUCATION"] |
| 38 | 2091191 | ["TRANSPORTATION"] |
| 35 | 3885107 | ["FINANCIAL INSTITUTIONS/INVESTMENTS/SECURITIES"] |
| 31 | 1684250 | ["BUDGET/APPROPRIATIONS"] |
| 28 | 1993836 | ["DEFENSE"] |
| 27 | 2182940 | ["ENERGY/NUCLEAR"] |
| 26 | 4587100 | ["BANKING"] |
| 26 | 1572786 | ["TAXATION/INTERNAL REVENUE CODE"] |
| 26 | 1596781 | ["TRADE (DOMESTIC/FOREIGN)"] |
| 24 | 3202900 | ["AVIATION/AIRLINES/AIRPORTS"] |
| 24 | 5781000 | ["TELECOMMUNICATIONS"] |
| 22 | 2250310 | ["AGRICULTURE"] |
| 18 | 1382555 | ["BUDGET/APPROPRIATIONS", "HEALTH ISSUES"] |
| 16 | 512986 | ["GOVERNMENT ISSUES"] |
| 14 | 1253587 | ["COPYRIGHT/PATENT/TRADEMARK"] |
| 14 | 835000 | ["NATURAL RESOURCES"] |
| 13 | 334008 | ["ENVIRONMENT/SUPERFUND"] |
| 12 | 720000 | ["BUDGET/APPROPRIATIONS", "DEFENSE"] |
| 12 | 592500 | ["MARINE/MARITIME/BOATING/FISHERIES"] |
| 11 | 509384 | ["BUDGET/APPROPRIATIONS", "EDUCATION"] |
| 11 | 358060 | ["IMMIGRATION"] |
| 10 | 322795 | ["BUDGET/APPROPRIATIONS", "TAXATION/INTERNAL REVENUE CODE"] |
| 10 | 1038000 | ["ENERGY/NUCLEAR", "TAXATION/INTERNAL REVENUE CODE"] |
| 10 | 2887500 | ["FOREIGN RELATIONS"] |
| 10 | 357500 | ["VETERANS"] |
| 9 | 2705000 | ["BANKING", "FINANCIAL INSTITUTIONS/INVESTMENTS/SECURITIES", "TAXATION/INTERNAL REVENUE CODE"] |
| 9 | 310000 | ["BUDGET/APPROPRIATIONS", "FOREIGN RELATIONS"] |
+-----------------------------------------------------------------------------------------------------------------+
30 rows available after 67 ms, consumed after another 0 ms
neo4j>
Notice the difference? While the largest number of filings are about the issue Taxation/Internal Revenue Code, as a standalone issue, it ranks only #14. This shows that the filings are multi-issue where Taxation/Internal Revenue Code is just one of the issues assigned.
The top 3 issue groupings by number of filings include Health Issues and Medicare/Medicaid.
Issue and Government Entities
Now let's look at the combination of issues and government entities are related to filings. I'll skip the preliminaries of looking for single combinations of issue/government entities and instead immediately group in Cypher.
MATCH (c:Client {name:'Sargento Foods'})<-[:ON_BEHALF_OF]-(f:Filing)-[:TARGETED_AT]->(g:GovernmentEntity)
WHERE f.amount > 0 AND g.name <> 'SENATE' AND g.name <> 'HOUSE OF REPRESENTATIVES'
WITH c, f, g ORDER BY g.name
WITH c, f, COLLECT(DISTINCT g.name) AS gnames
MATCH (f)-[:ABOUT]->(i:Issue)
WITH f, gnames, i ORDER BY i.code
WITH f, gnames, COLLECT(DISTINCT i.code) AS icodes
RETURN COUNT(f) AS cnt, SUM(f.amount), icodes, gnames AS amount
ORDER BY cnt DESC, amount DESC, icodes, gnames
LIMIT 10
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| cnt | SUM(f.amount) | icodes | amount |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 13 | 2214023 | ["HEALTH ISSUES", "MEDICARE/MEDICAID"] | ["Centers For Medicare and Medicaid Services (CMS)"] |
| 10 | 969755 | ["HEALTH ISSUES"] | ["Centers For Medicare and Medicaid Services (CMS)", "Health & Human Services, Dept of (HHS)"] |
| 9 | 710000 | ["FINANCIAL INSTITUTIONS/INVESTMENTS/SECURITIES", "HOUSING", "INSURANCE", "TAXATION/INTERNAL REVENUE CODE"] | ["Federal Housing Finance Agency (FHFA)", "Housing & Urban Development, Dept of (HUD)", "Treasury, Dept of"] |
| 9 | 415287 | ["EDUCATION"] | ["Education, Dept of"] |
| 8 | 800000 | ["HEALTH ISSUES"] | ["Health & Human Services, Dept of (HHS)"] |
| 8 | 3162923 | ["MEDICARE/MEDICAID"] | ["Centers For Medicare and Medicaid Services (CMS)", "Health & Human Services, Dept of (HHS)"] |
| 7 | 1350000 | ["HEALTH ISSUES", "MEDICARE/MEDICAID"] | ["Centers For Medicare and Medicaid Services (CMS)", "Health & Human Services, Dept of (HHS)"] |
| 7 | 868000 | ["HEALTH ISSUES"] | ["Centers For Medicare and Medicaid Services (CMS)"] |
| 6 | 650000 | ["FINANCIAL INSTITUTIONS/INVESTMENTS/SECURITIES"] | ["Securities & Exchange Commission (SEC)"] |
| 6 | 284384 | ["BUDGET/APPROPRIATIONS", "EDUCATION"] | ["Education, Dept of"] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
It's difficult to format the output for the article, but Health Issues and Medicare/Medicaid dominate, as does the government entity Centers For Medicare and Medicaid Services (CMS).
Registrant
Though clients may directly lobby government entities, the majority contract other organizations to lobby on their behalf; these organizations are experienced lobbyists and often experts on specific issues in which the client is interested. In the world of lobbying parlance, these are Registrants.
Continuing with Sargento filings, the first question is what registrants lobbied on behalf of Sargento and whether any registrant predominates. There are actually two ways to look at this: by the number of filings and by the total dollar amount of the filings.
By Total Filings
MATCH (c:Client {name:'Sargento Foods'})<-[]-(f:Filing)
WITH c, f
MATCH (c)-[]->(r:Registrant)-[]->(f)
WHERE f.amount > 0
RETURN r.name, COUNT(f), SUM(f.amount)
ORDER BY COUNT(f) DESC
LIMIT 30
+---------------------------------------------------------------------------------------+
| r.name | COUNT(f) | SUM(f.amount) |
+---------------------------------------------------------------------------------------+
| "PotlatchDeltic Corporation" | 19 | 1530000 |
| "SAFER GVW, INC" | 10 | 807664 |
| "U.S. MORTGAGE INSURERS" | 9 | 710000 |
| "UCB, Inc." | 7 | 1420000 |
| "INTERNATIONAL CODE COUNCIL (ICC)" | 7 | 140000 |
| "NATIONAL HOSPICE AND PALLIATIVE CARE ORGANIZATION" | 5 | 730000 |
| "SPIE, THE INTERNATIONAL SOCIETY FOR OPTICS AND PHOTONICS" | 5 | 150000 |
| "National Trust for Historic Preservation" | 5 | 286674 |
| "American Academy of Neurology" | 5 | 1600000 |
| "Equinor US Operations LLC" | 4 | 1050000 |
| "Ecolab Inc." | 4 | 1040000 |
| "Navistar, Inc." | 3 | 630000 |
| "Armenian Assembly of America, Inc." | 3 | 70000 |
| "Financial Services Institute" | 3 | 360000 |
| "Interstate Natural Gas Association of America" | 3 | 615000 |
| "NATIONAL RECREATION AND PARK ASSOCIATION" | 3 | 52000 |
| "E.I. Du Pont de Nemours & Co." | 3 | 3310000 |
| "American Gaming Association" | 3 | 970000 |
| "North American Meat Institute (American Meat Institute)" | 3 | 197491 |
| "ADTALEM GLOBAL EDUCATION" | 3 | 540000 |
| "American Speech-Language-Hearing Association" | 3 | 580000 |
| "Wells Fargo & Company" | 3 | 3090000 |
| "KENTUCKY FARM BUREAU FEDERATION" | 3 | 368966 |
| "HUNTSMAN CORPORATION" | 3 | 1300000 |
| "UNIVERSITY OF MINNESOTA" | 3 | 106000 |
| "Oxfam America, Inc." | 3 | 333901 |
| "Abbott Laboratories" | 3 | 2480000 |
| "BorgWarner Inc." | 3 | 430000 |
| "Crowe LLP" | 3 | 90000 |
| "Mr. Richard Sawaya" | 2 | 75000 |
+---------------------------------------------------------------------------------------+
30 rows available after 30 ms, consumed after another 0 ms
neo4j>
By Total Amount of Filings
MATCH (c:Client {name:'Sargento Foods'})<-[]-(f:Filing)
WITH c, f
MATCH (c)-[]->(r:Registrant)-[]->(f)
WHERE f.amount > 0
RETURN r.name, COUNT(f), SUM(f.amount)
ORDER BY SUM(f.amount) DESC
LIMIT 30
+------------------------------------------------------------------------------------+
| r.name | COUNT(f) | SUM(f.amount) |
+------------------------------------------------------------------------------------+
| "Chamber of Commerce of the U.S.A." | 1 | 16460000 |
| "National Association of Realtors" | 1 | 11520000 |
| "Pharmaceutical Research and Manufacturers of America" | 1 | 9910000 |
| "Airbus Americas, Inc (FKA Airbus Group, Inc)" | 2 | 6930000 |
| "American Medical Association" | 1 | 6770000 |
| "PFIZER INC." | 2 | 6220000 |
| "U.S. Chamber Institute for Legal Reform" | 1 | 5630000 |
| "NORTHROP GRUMMAN CORPORATION" | 1 | 5580000 |
| "GENERAL DYNAMICS CORP" | 2 | 5340054 |
| "American Hospital Association" | 1 | 5310000 |
| "United Technologies Corporation" | 1 | 5050000 |
| "Koch Companies Public Sector, LLC" | 1 | 4620000 |
| "AMERICA'S HEALTH INSURANCE PLANS, INC. (AHIP)" | 2 | 4430000 |
| "National Association of Broadcasters" | 1 | 3930000 |
| "Amazon.com Services, Inc." | 1 | 3890000 |
| "General Motors Company" | 1 | 3840000 |
| "LOCKHEED MARTIN CORPORATION" | 1 | 3830000 |
| "Walmart Inc." | 2 | 3700000 |
| "INTERNATIONAL BUSINESS MACHINES CORPORATION (IBM)" | 2 | 3580000 |
| "The Business Roundtable, Inc." | 1 | 3560000 |
| "Comcast Corporation" | 1 | 3500000 |
| "Facebook, Inc." | 1 | 3400000 |
| "Google LLC" | 1 | 3360000 |
| "Boeing Company" | 1 | 3330000 |
| "E.I. Du Pont de Nemours & Co." | 3 | 3310000 |
| "NCTA - THE INTERNET & TELEVISION ASSOCIATION" | 1 | 3280000 |
| "Securities Industry and Financial Markets Association" | 2 | 3270000 |
| "Amgen, Inc." | 1 | 3250000 |
| "Cigna Corporation" | 2 | 3220000 |
| "Novartis" | 1 | 3200000 |
+------------------------------------------------------------------------------------+
30 rows available after 25 ms, consumed after another 0 ms
neo4j>
Interestingly, the 19 PotlatchDeltic Corporation totals a measly $1.5M, while the 1 Chamber of Commerce of the U.S.A. filing is for $16.5M. I'm not a lobbying expert and therefore can't explain the different ways a lobbying effort might work, but it's definitely something to explore.
Conclusion
This was never intended to be an exhaustive investigation into US federal lobbying, but rather an introduction into how Neo4J and Cypher might be used to learn more; any relevant conclusions would require a much larger data set.
Other possible areas to analysis might be:
- Are lobbyists — who are hired by registrants — focused on specific issues or government entities?
- Is there any correlation between registrants and the amount spent on lobbying with the number of issues they are competent in?
- Are there multiple clients all lobbying for specific issues? And do they use the same registrants to lobby? Is there a way to determine whether they are lobbying for the same or a different outcome?
Definitely a start, but much more work is possible, especially with larger data sets.
Other Notes
All queries were executed in the Cypher Shell, a utility that is included in the Neo4J install.
When collecting the names of government entities or the issue codes, the complete string can become rather long and the results displayed might be too long for display in the article; sorry.
Opinions expressed by DZone contributors are their own.
Comments