23 Finding dual-degree holders

# I always begin by loading the Disco Engine if it isn't already loaded
library(discoveryengine)

We recently received a request to look at Haas MBAs who hold an additional Berkeley degree beyond their MBA. This was a tricky one. If we wanted to find people who have a joint MBA/MPH, we could just write:

has_degree(MBA) %and% has_degree(MPH)

But for this request, we want to find people with an MBA plus any other degree. We want to write something like:

has_degree(MBA) %and% 
    has_degree(any_other_degree_besides_mba)

How can we do that? Enter the not() operator. Inside a widget, not() excludes, rather than includes, the codes entered, so has_degree(not(MBA)) finds anyone with at least one degree that is not an MBA. So:

mba_dual_alum = has_degree(MBA) %and% 
    has_degree(not(MBA))

display(mba_dual_alum)
## # A tibble: 3,640 x 1
##    entity_id
##        <dbl>
##  1       137
##  2       159
##  3       217
##  4       235
##  5       252
##  6       257
##  7       311
##  8       340
##  9       358
## 10       513
## # … with 3,630 more rows

23.1 Why not %but_not%?

You might wonder why we need not() when we can already combine widgets with %but_not%. Our example helps illustrate the difference. This –

has_degree(MBA) %but_not% has_degree(MBA)

– looks for anyone who has an MBA degree and also does not have an MBA degree, which is impossible. What we did, on the other hand –

has_degree(MBA) %and% has_degree(not(MBA))

– looks for anyone who has an MBA degree as well as at least one non-MBA degree. As we saw, there are 3,640 such individuals.
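
To make the distinction concrete, here is a minimal base R sketch on a made-up degree table. It does not use the Disco Engine at all: the data frame, its column names, and the variables are hypothetical, and it assumes that %and% behaves like a set intersection of entity IDs and %but_not% like a set difference, which matches the descriptions above but is not taken from the package itself.

```r
# Toy degree table: one row per (entity, degree). All values are made up.
degrees <- data.frame(
  entity_id = c(1, 1, 2, 3, 3, 4),
  degree    = c("MBA", "MPH", "MBA", "BA", "JD", "MBA"),
  stringsAsFactors = FALSE
)

# Stand-in for has_degree(MBA): entities with at least one MBA row
has_mba <- unique(degrees$entity_id[degrees$degree == "MBA"])

# Stand-in for has_degree(not(MBA)): entities with at least one non-MBA row.
# The exclusion is applied to degree rows, not to people.
has_non_mba <- unique(degrees$entity_id[degrees$degree != "MBA"])

# Stand-in for has_degree(MBA) %and% has_degree(not(MBA)):
# assumed to act like a set intersection
dual <- intersect(has_mba, has_non_mba)

# Stand-in for has_degree(MBA) %but_not% has_degree(MBA):
# assumed to act like a set difference, so it is always empty
impossible <- setdiff(has_mba, has_mba)

print(dual)               # [1] 1
print(length(impossible)) # [1] 0
```

Entity 1 holds both an MBA and an MPH, so it appears in both sets and survives the intersection. Entities 2 and 4 hold only an MBA, and entity 3 holds no MBA, so none of them qualify. The set difference of a group with itself is empty no matter what the data looks like.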