| Attribute | Description |
| --- | --- |
| Data_id_num | A number that matches the name of the CV file and uniquely identifies each row of data. |
| Blind_resume | The name of the anonymized CV file. |
| Position | The target position/title required by the relevant consulting service or project. |
| Project_and_role_detail_description | A description of the expected expertise, competencies, past experience, and project-specific requirements for the position. |
| Client_name | The anonymized name of the client. |
| Client_Industry | The client's industry. |
| Label | A binary label that indicates whether the match is correct. |
| Technologies | Technologies in which the candidate has expertise and actively uses in their projects. |
| Education | The university and department the candidate graduated from. |
| Experience_work_experience | The candidate's previous companies, positions, employment dates, and key responsibilities. |
| Projects | The projects in which the candidate has participated. |
| Similarity | The semantic similarity score between the Position + Project_and_role_detail_description text and the CV text (float, between 0 and 1). |
| CV_word_count | Total word count of the CV text. |
| Common_kw_count | Number of words from the position + role description text that also appear in the CV. |
| Position_kw_in_cv | Number of times the words in the position title appear in the CV. |
Table 1: Attributes and their descriptions
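The count-based features in Table 1 can be sketched as follows. The paper does not specify how the Similarity score is computed, so TF-IDF cosine similarity is used here purely as a stand-in for the semantic similarity model; the tokenization (whitespace splitting) is likewise an illustrative assumption.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def extract_features(position: str, role_detail: str, cv_text: str) -> dict:
    """Sketch of the Table 1 features; similarity model is a stand-in."""
    query = f"{position} {role_detail}"

    # Similarity: TF-IDF cosine similarity as a placeholder for the
    # (unspecified) semantic similarity score in [0, 1].
    tfidf = TfidfVectorizer().fit_transform([query, cv_text])
    similarity = float(cosine_similarity(tfidf[0:1], tfidf[1:2])[0, 0])

    cv_words = cv_text.lower().split()
    query_words = set(query.lower().split())
    position_words = set(position.lower().split())

    return {
        "Similarity": similarity,
        "CV_word_count": len(cv_words),
        # Distinct position + role words that also appear in the CV.
        "Common_kw_count": len(query_words & set(cv_words)),
        # Total occurrences of position-title words in the CV.
        "Position_kw_in_cv": sum(cv_words.count(w) for w in position_words),
    }
```

A query/CV pair with overlapping vocabulary yields a similarity near 1 and nonzero keyword counts, mirroring how the dataset rows pair a position description with a CV.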
| Model | Hyperparameter Range |
| --- | --- |
| SVM | "C_Values": [0.1 - 100] |
| LightGBM | "Verbosity": [-1], "Seed": [42], "Num_Boost_Round": [100], "Num_Leaves": [15, 31, 63], "Min_Child_Samples": [5, 10, 20], "Max_Depth": [-1, 5, 10], "Learning_Rate": [0.01, 0.05, 0.1], "N_Estimators": [50, 100, 200], "Min_Split_Gain": [0, 0.01, 0.1] |
| LR | "Random_State": [42], "Max_Iter": [100 - 1000], "C": [0.01 - 100] |
| ELM | "N_Hidden": [50], "Random_State": [42] |
| Voting | LR ("C": [10], "Max_Iter": [1000], "Random_State": [42]); SVM ("C": [1.0], "Random_State": [42]); LightGBM ("Learning_Rate": [0.01], "Max_Depth": [-1], "Min_Child_Samples": [20], "Min_Split_Gain": [0], "N_Estimators": [100], "Num_Leaves": [15], "Random_State": [42]) |
| Bagging | LR ("N_Estimators": [50]); SVM and LightGBM ("N_Estimators": [20], "Random_State": [42], "N_Jobs": [-1]) |
| Boosting | AdaBoost - LR ("C": [10], "Max_Iter": [1000], "Random_State": [42]) with "N_Estimators": [20], "Learning_Rate": [1.0]; LightGBM ("Learning_Rate": [0.01], "Max_Depth": [-1], "Min_Child_Samples": [20], "Min_Split_Gain": [0], "N_Estimators": [100], "Num_Leaves": [15], "Random_State": [42]) |
| Stacking | LR ("C": [10], "Max_Iter": [1000]); SVM ("C": [1.0]); LightGBM ("Learning_Rate": [0.01], "Max_Depth": [-1], "Min_Child_Samples": [20], "Min_Split_Gain": [0], "N_Estimators": [100], "Num_Leaves": [15]) |
Table 2: Model hyperparameter ranges
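A search over the ranges in Table 2 can be sketched with scikit-learn's `GridSearchCV`, shown here for the LR row only. The concrete grid values ([0.01, 0.1, 1, 10, 100] for C and [100, 500, 1000] for Max_Iter) are an interpolation of the stated ranges, not values confirmed by the source, and the dataset is a synthetic stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the CV-matching feature matrix and labels.
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

param_grid = {
    "C": [0.01, 0.1, 1, 10, 100],   # assumed steps over "C": [0.01 - 100]
    "max_iter": [100, 500, 1000],   # assumed steps over "Max_Iter": [100 - 1000]
}
search = GridSearchCV(
    LogisticRegression(random_state=42),  # "Random_State": [42]
    param_grid,
    scoring="f1",
    cv=5,
)
search.fit(X, y)
print(search.best_params_)
```

The same pattern applies to the SVM and LightGBM rows by swapping in the corresponding estimator and grid.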
| Method | Model Type | Precision | Recall | Accuracy | F1-Score |
| --- | --- | --- | --- | --- | --- |
| SVM | Model 1 | 0.6333 | 0.6333 | 0.6271 | 0.6333 |
| SVM | Model 2 | 0.7037 | 0.6333 | 0.6780 | 0.6667 |
| SVM | Model 3 | 0.7241 | 0.7000 | 0.7118 | 0.7118 |
| SVM | Model 4 | 0.7586 | 0.7333 | 0.7458 | 0.7458 |
| SVM | Model 5 | 0.5946 | 0.7333 | 0.6102 | 0.6567 |
| LightGBM | Model 1 | 0.6786 | 0.6333 | 0.6610 | 0.6552 |
| LightGBM | Model 2 | 0.7931 | 0.7767 | 0.7797 | 0.7797 |
| LightGBM | Model 3 | 0.7812 | 0.8333 | 0.7966 | 0.8065 |
| LR | Model 1 | 0.7000 | 0.7000 | 0.6949 | 0.7000 |
| LR | Model 2 | 0.7778 | 0.7000 | 0.7458 | 0.7368 |
| LR | Model 3 | 0.7667 | 0.7667 | 0.7627 | 0.7667 |
| ELM | Model 1 | 0.7857 | 0.7333 | 0.7627 | 0.7586 |
| Voting | Hard Voting | 0.8276 | 0.8000 | 0.8136 | 0.8136 |
| Voting | Soft Voting | 0.8462 | 0.7333 | 0.7960 | 0.7857 |
| Bagging | LightGBM | 0.8065 | 0.8333 | 0.8136 | 0.8197 |
| Bagging | SVM | 0.8000 | 0.8000 | 0.7966 | 0.8000 |
| Bagging | LR | 0.8214 | 0.7667 | 0.7966 | 0.7931 |
| Boosting | Boosting - LightGBM | 0.7143 | 0.8333 | 0.7458 | 0.7692 |
| Boosting | AdaBoost - LR | 0.7500 | 0.7000 | 0.7288 | 0.7241 |
| Stacking | LR + SVM + LightGBM | 0.7500 | 0.8000 | 0.7627 | 0.7742 |
Table 3: The results obtained with the developed models
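The hard- vs soft-voting comparison in Table 3 can be sketched as below, using only the LR and SVM base learners with the Table 2 voting parameters; the LightGBM member is omitted here to keep the sketch dependency-free, and the data is a synthetic stand-in, so the scores will not reproduce Table 3.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data; the real study uses CV-matching features.
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

estimators = [
    ("lr", LogisticRegression(C=10, max_iter=1000, random_state=42)),
    # probability=True is required so SVC can contribute to soft voting.
    ("svm", SVC(C=1.0, probability=True, random_state=42)),
]
for voting in ("hard", "soft"):
    clf = VotingClassifier(estimators=estimators, voting=voting).fit(X_tr, y_tr)
    print(voting, f1_score(y_te, clf.predict(X_te)))
```

Hard voting takes the majority class label, while soft voting averages the members' predicted probabilities, which is why the two rows in Table 3 can rank differently on precision and recall.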