Description & Requirements
Each day U.S. Customs and Border Protection (CBP) oversees the massive flow of people, capital, and products that enter and depart the United States via air, land, sea, and cyberspace. The volume and complexity of both physical and virtual border crossings require the application of “big data” solutions to promote efficient trade and travel. Further, effective “big data” solutions help CBP ensure the movement of people, capital, and products is legitimate, safe, and secure.
Responsibilities include but are not limited to:
Perform hands-on analysis and modeling with large, complex data sets to provide solutions to support a wide range of law enforcement mission areas.
Demonstrate proficiency in extracting, cleaning, and transforming CBP transactional and mission data associated with an identified problem space to build predictive models, and develop appropriate supporting documentation.
Leverage knowledge of a variety of statistical and machine learning techniques and methods to define and develop programming algorithms; train, evaluate, and deploy predictive analytics models that directly inform mission decisions.
Execute projects, including those intended to identify patterns and/or anomalies in large datasets and to perform automated text/data classification and categorization; entity recognition, resolution, and extraction; and named entity matching.
Minimum Qualifications:
Experience in developing machine learning models and applying advanced analytics solutions to solve complex business problems.
Experience with programming languages including R, Python, Scala, and Java.
Experience with SQL programming.
Experience constructing and executing queries to extract data in support of exploratory data analysis (EDA) and model development.
Experience with pattern recognition and extraction, automated classification, and categorization.
Experience with entity resolution (e.g., record linking, named entity matching, deduplication/disambiguation).
Bachelor’s Degree (required) in operations research, industrial engineering, mathematics, statistics, computer science/engineering, or other related technical fields with equivalent practical experience.
One of the following is required: a high school diploma and 6 years of experience, an associate degree and 4 years of experience, a bachelor’s degree and 2 years of experience, a master’s degree and 0-5 years of experience, or a PhD and 0-3 years of experience.
Preferred Qualifications:
Proficiency with statistical software packages including: SAS, SPSS Modeler, R, WEKA, or equivalent
Proficiency with Unsupervised Machine Learning methods including Cluster Analysis (e.g., K-means, Hierarchical), Dimensionality Reduction (e.g., Principal Component Analysis), Deep Belief Networks, Segmentation, etc.
Proficiency with Supervised Machine Learning methods including Decision Trees, Support Vector Machines, Logistic Regression, K-nearest Neighbor, Random/Rotation Forests, Categorization/Classification, Neural Nets, Bayesian Networks, etc.
Experience with pattern recognition and extraction, automated classification, and categorization.
Experience with entity resolution (e.g., record linking, named entity matching, deduplication/disambiguation).
Experience with visualization tools and techniques (e.g., Periscope, Business Objects, D3, ggplot, Tableau, SAS Visual Analytics, PowerBI)
Experience with big data technologies (e.g., Hadoop, HIVE, HDFS, HBase, MapReduce, Spark, Kafka, Sqoop)
Master’s degree in mathematics, statistics, computer science/engineering, or other related technical fields with equivalent practical experience
Clearance Requirements:
Selected applicants must be U.S. citizens and able to obtain and maintain a U.S. Customs and Border Protection (CBP) suitability determination.
Physical Requirements:
The person in this position needs to occasionally move about inside the office to access file cabinets, office machinery, or to communicate with co-workers, management, and customers, which may involve delivering presentations.