Multi-parent order crossover mechanism of genetic algorithm for minimizing violation of soft constraint on course timetabling problem

Article history: Received 26 July 2019; Revised 5 August 2019; Accepted 6 August 2019; Published 3 April 2020
The crossover operator is one of the critical procedures in genetic algorithms. It creates new chromosomes by mating parents, which lets the search cover an extensive search space. In the course timetabling problem, the quality of a solution is evaluated against hard and soft constraints. The hard constraints must be satisfied without violation, while the soft constraints may be violated. In this research, a multi-parent crossover mechanism is used to modify the classical crossover and minimize the violation of soft constraints in order to produce a good solution. The multi-parent order crossover mechanism tends to produce better chromosomes and also prevents the genetic algorithm from being trapped in a local optimum. The experiment with 21 datasets shows that the multi-parent order crossover mechanism provides better performance and fitness values than the classical mechanism, and can reach a fitness value of zero, that is, no violations. The proposed method is therefore effective in producing feasible course timetables.


Introduction
The genetic algorithm is an optimization algorithm that adopts the process of natural selection to obtain the best collection of individuals in a population. It is a metaheuristic often used in optimization problems such as creating course timetables in colleges. Many other metaheuristics exist, such as ant colony and swarm algorithms; however, the genetic algorithm is reported to carry the least risk of constraint violations [1].
An adaptive genetic algorithm has been used to improve the effectiveness of solving automated examination timetabling problems in universities [2]. It was analysed using graph colouring, which proved its efficiency in solving course timetabling problems [3]. However, premature convergence and stagnation tend to occur while searching for solutions [4]. Therefore, research on developing suitable genetic algorithms is still ongoing, especially for course timetabling.
Course timetabling involves not only courses but also instructors, who must be scheduled in particular rooms and timeslots so as to fulfil several requirements, or constraints, which are categorized as hard and soft. Hard constraints must not be violated, while soft constraints are not strictly binding but should preferably be fulfilled; the fewer soft constraints are violated, the better the generated schedule. Course timetabling problems are NP-hard: optimal solutions are difficult to obtain for large scheduling instances because all of the constraints have to be met [4].
A genetic algorithm (GA) consists of several operators that are important in obtaining good solutions, namely selection, crossover and mutation. From a technical point of view, these three operators are the basis for producing the desired performance [5]. The crossover operator plays an important role in GA and has been called its backbone, so further study of it is still needed [6]. A GA modification in the form of Directed-Based Crossover (DBX) has been used to obtain a better population in each generation [7]. A comparison of Order Crossover (OX), Partially Mapped Crossover (PMX) and Cycle Crossover found OX to be the most effective in producing feasible scheduling solutions [8]. Modifications of OX in a recursive GA have also been made to produce well-scheduled course timetables [9]. PMX has been modified with a feasibility check during gene exchange, based on the period, to obtain good offspring [10]. Carrying out the feasibility check during the exchange process gives much faster and better performance than carrying it out afterwards. Meanwhile, to increase diversity in GA, several studies on multi-parent crossover have produced solutions that avoid similarity among them [11,12]. The BLX-α multi-parent crossover was also developed to enhance the exploration process in GA over a larger search space [13].
GA is modified to avoid premature convergence by increasing exploration of the solution search space. The use of multi-parent crossover makes chromosomes more diverse, which increases exploration. In course timetabling with soft constraints, a solution with a low level of violation is crucial. Therefore, this study aims to develop the GA crossover operator using a multi-parent order mechanism. This is expected to produce high diversity, allowing the solution space to be explored more fully and reducing the level of violation of soft constraints.

Research Method
Course timetabling is a routine activity carried out each semester in universities. The problem belongs to the NP-hard class, which is very difficult to solve using a classical algorithm [14]. A number of events (courses, lecturers) need to be allocated to limited rooms and timeslots so as to fulfil various constraints [15]. A set of constraints needs to be satisfied in solving this timetabling problem [16]. In this study, the ITC-2007 dataset of the curriculum-based course timetabling (CB-CTT) type is used to define the constraints.
The hard constraints in the dataset consist of:
H.1. Lectures: All lectures need to be scheduled in an available timeslot (day, period).
H.2. Room occupancy: Two lectures may not be placed in the same room and period.
H.3. Conflicts: Courses in one curriculum group or taught by the same lecturer need to be scheduled at different periods.
H.4. Availability: A course may have specific periods in which it cannot be placed.
The soft constraints consist of:
S.1. Room capacity: The number of students taking the course needs to be less than or equal to the room capacity. Every student without a seat is penalized.
S.2. Room stability: All lectures of one course should be placed in the same room. Lectures placed in different rooms are penalized.
S.3. Minimum working days: The lectures of each course need to be distributed over the assigned minimum number of days. Every day of shortfall is subject to a penalty of five points.
S.4. Curriculum compactness: Lectures in one curriculum should be adjacent to one another. Those that are not are subject to a penalty of two points.
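The soft-constraint penalties listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Lecture` tuple, the dictionary layouts, the one-point weights for S.1 and S.2, and the reading of S.4 as "a lecture with no neighbouring lecture of the same curriculum on the same day" are all assumptions. Only the weights of five points (S.3) and two points (S.4) come from the text.

```python
from collections import defaultdict
from typing import NamedTuple

class Lecture(NamedTuple):
    # Hypothetical gene layout: one scheduled lecture of a course.
    course: str
    day: int
    period: int
    room: str

def soft_penalty(timetable, students, capacity, min_days, curricula):
    """Sum the S.1-S.4 penalties for a list of Lecture tuples."""
    penalty = 0
    rooms_used = defaultdict(set)
    days_used = defaultdict(set)
    for lec in timetable:
        # S.1: every student without a seat costs 1 point (assumed weight).
        penalty += max(0, students[lec.course] - capacity[lec.room])
        rooms_used[lec.course].add(lec.room)
        days_used[lec.course].add(lec.day)
    for course in rooms_used:
        # S.2: each room beyond the first costs 1 point (assumed weight).
        penalty += len(rooms_used[course]) - 1
        # S.3: each day below the required minimum costs 5 points.
        penalty += 5 * max(0, min_days[course] - len(days_used[course]))
    for members in curricula.values():
        slots = {(l.day, l.period) for l in timetable if l.course in members}
        for day, period in slots:
            # S.4: a lecture with no adjacent lecture of the same curriculum
            # on the same day costs 2 points (assumed interpretation).
            if (day, period - 1) not in slots and (day, period + 1) not in slots:
                penalty += 2
    return penalty
```

Here `curricula` maps a curriculum name to the set of its course names, and `min_days` maps each course to its Min-Working-Days value.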

Research Methodology
In this research, GA is used to solve the course timetabling problem. The genetic algorithm is a search algorithm based on the principles of Darwinian evolutionary biology, and is currently used in various studies, including scheduling [17,18]. In addition to the multi-parent order crossover mechanism, which is the main contribution, the algorithm comprises a random initial population mechanism, an exchange mutation mechanism, and an improvement function.
The system design is shown in Figure 1.

Dataset
In this research, the dataset was obtained from the International Timetabling Competition (ITC-2007), which consists of several problems. This study uses the CB-CTT data type, which comprises 21 dataset files, with the aim of reducing violations of soft constraints to a minimum while avoiding violations of the hard constraints. CB-CTT data has several attributes, including:
1. Courses: Each course has a number of lectures and a number of students, and is to be scheduled in certain periods.
2. Rooms: Each room has a seating capacity.
3. Days: Number of days available.
4. Period Per Day: Number of periods in one day.
5. Curricula: A group of subjects.
6. Min-Working-Days: The minimum number of days over which the lectures of a course need to be spread.
Figure 2. Chromosome representation
Chromosome encoding is carried out before the initial population is generated, and it influences the success of the GA. It affects not only efficiency, but also the speed and quality of the solution obtained.

Chromosome representation
In this research, a chromosome represents course, day, period, and room. Its length depends on the number of courses: the more courses, the longer the chromosome. For further explanation, see Figure 2.
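One possible encoding of such a chromosome, together with a random initial placement, is sketched below. The `Gene` tuple, the function name `random_chromosome`, and the choice of one gene per lecture are illustrative assumptions; the layout actually used is the one shown in Figure 2.

```python
import random
from typing import NamedTuple

class Gene(NamedTuple):
    # Hypothetical gene: one lecture with its assigned day, period and room.
    course: str
    day: int
    period: int
    room: str

def random_chromosome(lectures_per_course, days, periods_per_day, rooms, rng=random):
    """Build one chromosome with one gene per lecture, placed at random."""
    chromosome = []
    for course, n_lectures in lectures_per_course.items():
        for _ in range(n_lectures):
            chromosome.append(Gene(course,
                                   rng.randrange(days),
                                   rng.randrange(periods_per_day),
                                   rng.choice(rooms)))
    return chromosome
```

A random placement like this does not guarantee that the hard constraints hold, so a feasibility check or repair step would still be needed afterwards.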

Fitness function
The fitness function is used to evaluate candidate solutions regardless of the problem [19]. Its form depends on the specifications of the problem to be solved. In scheduling, the quality of a chromosome is measured by the number of constraints violated. Each chromosome is checked for validity by analysing the hard and soft constraints. For hard constraints no violations are allowed, while for soft constraints violations may occur but should be as few as possible. Every violated soft constraint is given a penalty value.
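The validity check on the hard constraints can be illustrated with a short sketch covering H.2 (room occupancy) and H.3 (conflicts). The function name `hard_violations`, the plain `(course, day, period, room)` tuples, and the `teacher` and `curricula` dictionaries are assumptions made for this example; H.1 and H.4 are omitted for brevity.

```python
from collections import Counter

def hard_violations(timetable, teacher, curricula):
    """Count H.2 and H.3 violations in a list of (course, day, period, room)."""
    violations = 0
    # H.2: two lectures may not share a room in the same period.
    room_slots = Counter((day, period, room) for _, day, period, room in timetable)
    violations += sum(n - 1 for n in room_slots.values() if n > 1)
    # H.3: lectures given by the same lecturer or belonging to the same
    # curriculum must not be scheduled in the same period.
    for i, (c1, d1, p1, _) in enumerate(timetable):
        for c2, d2, p2, _ in timetable[i + 1:]:
            if (d1, p1) != (d2, p2):
                continue
            same_teacher = teacher[c1] == teacher[c2]
            same_curriculum = any(c1 in m and c2 in m for m in curricula.values())
            if same_teacher or same_curriculum:
                violations += 1
    return violations
```

A chromosome would be treated as feasible when this count is zero, after which its fitness is the sum of the soft-constraint penalties.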

Rank selection
Rank Selection (RS) is a technique that uses explorative concepts to prevent premature convergence. It sorts the population by fitness value and assigns each chromosome a rank, from the highest to the lowest. In this research, three parents are chosen and used in the crossover process. Two are selected by RS, while the third is the chromosome with the best fitness. This maintains the diversity of the population and explores a wider solution space. The wider the search space, the easier it is for GA to obtain the global optimum.
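The three-parent selection described above might look like the following sketch. It assumes that fitness is a penalty (lower is better) and uses linear rank weights; both the weighting scheme and the name `select_three_parents` are assumptions, since the text does not give the exact selection probabilities.

```python
import random

def select_three_parents(population, fitness, rng=random):
    """Return (best, first_rs, second_rs) from a list of chromosomes."""
    # Rank from best (lowest penalty) to worst.
    ranked = sorted(population, key=fitness)
    n = len(ranked)
    # Linear rank weights: the best gets n, the worst gets 1 (assumed scheme).
    weights = [n - i for i in range(n)]
    # Two parents are drawn by rank; sampling is with replacement.
    first_rs, second_rs = rng.choices(ranked, weights=weights, k=2)
    return ranked[0], first_rs, second_rs
```

Because `random.choices` samples with replacement, the two rank-selected parents can coincide; drawing without replacement would be an equally reasonable variant.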

Multi-parent order crossover mechanism
An additional parent in the crossover process reduces the pressure in the selection process and increases the diversity of the population. The higher the level of diversity, the smaller the occurrence of incest (mating between similar genes). The multi-parent order crossover mechanism uses the three parents resulting from the previous selection process. The multi-parent OX mechanism works by choosing a gene sequence at a random position on the best chromosome and copying it into a new offspring.
An additional chromosome is thus added to the parents, since it extends the exploration process into a wider search space [13]. Next, the empty block of the offspring on the left side is filled sequentially with genes from the first chromosome chosen by RS. The empty block on the right is filled with genes from the second chromosome chosen by RS. If the exchange process violates a hard constraint, the repair mechanism is carried out. An illustration of the multi-parent OX mechanism is shown in Figure 3.
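The fill order described above can be sketched as follows. For simplicity this example assumes that each chromosome is a permutation of the same lecture identifiers, as in textbook order crossover, and it omits the repair step for hard-constraint violations; the function name `multi_parent_ox` is hypothetical.

```python
import random

def multi_parent_ox(best, first_rs, second_rs, rng=random):
    """Three-parent order crossover on permutations of equal length."""
    n = len(best)
    # Pick a random block [i, j) of the best parent to copy into the offspring.
    i, j = sorted(rng.sample(range(n + 1), 2))
    offspring = [None] * n
    offspring[i:j] = best[i:j]
    used = set(best[i:j])
    # Fill the left gap, in order, with unused genes of the first RS parent.
    left = [g for g in first_rs if g not in used][:i]
    offspring[:i] = left
    used.update(left)
    # Fill the right gap, in order, with the unused genes of the second RS parent.
    offspring[j:] = [g for g in second_rs if g not in used]
    return offspring
```

Skipping genes that are already present keeps the offspring a valid permutation, which is the property that distinguishes order crossover from a plain multi-point crossover.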

Figure 3. Multi-parent order crossover mechanism process. Legend: chromosome with the lower fitness value; first chromosome from rank selection.

Exchange mutation mechanism
Mutation helps to improve the quality of the offspring previously formed. Its role is to make small changes to the chromosomes so that they meet the predetermined constraints. There are several types of mutation, namely exchange (EX), insertion (INS), and swap (SWP). A study of these mutations showed that EX was the best mutation operator compared to INS and SWP [20], so EX is used here. EX works by selecting two genes and exchanging their positions; if this violates a hard constraint, the repair mechanism is executed. An illustration of the EX mechanism is given in Figure 4.
Table 1. Improvement function
1. When the number of students in the chosen course is greater than the room capacity, move it randomly.
2. When the lectures of the chosen course are in different rooms, move them to the same room on a day and period when the room is available.
3. When the chosen course is not adjacent to others on the same day, move it next to the other courses of the same curriculum.
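The exchange (EX) mutation described in this section can be sketched as a swap of the timeslot and room of two randomly chosen lectures. The gene layout `(course, day, period, room)` and the function name `exchange_mutation` are assumptions, and the repair of hard-constraint violations is left out.

```python
import random

def exchange_mutation(chromosome, rng=random):
    """Return a copy in which two randomly chosen lectures swap their slots."""
    mutated = list(chromosome)
    # Choose two distinct gene positions (needs at least two genes).
    a, b = rng.sample(range(len(mutated)), 2)
    course_a, *slot_a = mutated[a]
    course_b, *slot_b = mutated[b]
    # Each course keeps its identity and takes the other's (day, period, room).
    mutated[a] = (course_a, *slot_b)
    mutated[b] = (course_b, *slot_a)
    return mutated
```

Returning a copy leaves the parent chromosome untouched, which is convenient when the offspring is later compared against the existing population.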

Improvement function
Offspring may not be optimal; therefore, at this stage an improvement function is applied, inspired by the neighborhood function used in a previous study [21]. The repair function is applied to an infeasible chromosome to make it feasible more quickly by reducing constraint violations. The improvement function in this study consists of three stages.
More details of the improvement function are given in Table 1.
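As an illustration of the first improvement stage (room capacity), the sketch below moves a lecture to a randomly chosen room when its current room is too small. Restricting the random choice to rooms that are large enough, when any exist, is an assumption added for this example, as are the function name `fix_room_capacity` and the data layout; the sketch also ignores whether the new room is free in that period.

```python
import random

def fix_room_capacity(chromosome, students, capacity, rng=random):
    """Stage 1 sketch: move lectures out of rooms that are too small."""
    improved = []
    for course, day, period, room in chromosome:
        if students[course] > capacity[room]:
            # Assumption: prefer rooms that are large enough, else any room.
            fitting = [r for r in capacity if capacity[r] >= students[course]]
            room = rng.choice(fitting or list(capacity))
        improved.append((course, day, period, room))
    return improved
```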

Worst replacement
After passing through the mutation process, the offspring formed is included in the population. One of the replacement operations is Worst Replacement (WR), which has been applied in several studies, especially on scheduling problems [14]. It takes place once the offspring is formed and replaces the individual in the population with the highest violation, that is, the highest fitness value. This operation keeps the best collection of individuals in the population.
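Worst replacement can be sketched in a few lines, assuming that fitness is a penalty so that the worst individual is the one with the highest value. The function name and the unconditional replacement (the offspring is inserted even if it is no better than the individual it replaces) are assumptions of this example.

```python
def worst_replacement(population, offspring, fitness):
    """Replace the individual with the highest penalty by the offspring, in place."""
    worst_index = max(range(len(population)), key=lambda k: fitness(population[k]))
    population[worst_index] = offspring
    return population
```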

Results and Discussion
Tests in this study are divided into two scenarios. The first compares the classical GA with the proposed GA, and the second compares the proposed GA with a GA from other literature. Figure 5 shows how the fitness value changes from generation to generation for the classical GA and the proposed GA, using a sample from the 2nd dataset. The fitness value of both algorithms changes over the generations because new parents are chosen by rank selection, recombined using the multi-parent order crossover mechanism, and mutated through exchange. The result of the mutation is improved using the improvement function, and the offspring replaces the worst chromosome in the population. The process stops when the fitness value reaches 0, the global optimum, or when 3000 generations have been produced.
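The overall flow and its stopping criterion (fitness 0 or 3000 generations) can be summarized in a generic loop. This is only a sketch: every operator is passed in as a callable, and the names `run_ga`, `select`, `crossover`, `mutate`, `improve` and `replace` are placeholders rather than the authors' code.

```python
def run_ga(population, fitness, select, crossover, mutate, improve, replace,
           max_generations=3000):
    """Generic GA loop; every operator is supplied by the caller."""
    best = min(population, key=fitness)
    for _ in range(max_generations):
        if fitness(best) == 0:
            # Global optimum: no soft-constraint violations remain.
            break
        parents = select(population, fitness)
        child = improve(mutate(crossover(*parents)))
        population = replace(population, child, fitness)
        best = min(population, key=fitness)
    return best, fitness(best)
```

Passing the operators as arguments makes it straightforward to run the classical and the multi-parent crossover through the same loop, so that only the crossover differs between the two scenarios.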

Comparison of fitness value between classical GA and proposed GA
As seen in the performance graph in Figure 5, the fitness value of the proposed GA decreased more than that of the classical GA. Between generations 300 and 3000, the fitness value of the classical GA only decreased by 150, while in the proposed method the value decreased by 80. Between generations 2500 and 3000, the classical GA was stagnant, or trapped in a local optimum, whereas the fitness of the proposed method was still going down. This indicates that the multi-parent order crossover mechanism increased exploration of the search space, with a higher possibility of obtaining optimal solutions.
Table 2 presents the number of soft-constraint violations for the classical GA and the proposed method. Each algorithm obtains a fitness value based on the type of soft constraint violated. In the 11th dataset, both algorithms reached the global optimum, satisfying the soft constraints with no violations, in other words with a fitness value of 0. In the 12th dataset, both algorithms had their highest fitness value, due to the high number of violations that occurred in the S.4 constraint, with a fitness value of 291 for the classical GA and 254 for the proposed GA. In all datasets tested, the proposed GA performed better than the classical GA, as shown by its smaller fitness value.
Table 3 presents the best and average fitness values of the classical and proposed GA over 30 runs on each dataset. It shows that the best value of the proposed method is smaller than that of the classical GA in all datasets. This indicates that the probability of obtaining an optimal solution is higher with the proposed method. It is also concluded that the use of multiple parents in crossover increases exploration of the search space and is effective in reducing violations of soft constraints.

Comparison of GA method from other literature with proposed GA
At this stage, the fitness values of the proposed GA method are compared with those from other literature [10]. That work presents the results of a hybrid GA used to solve course scheduling problems. The proposed method and the method from the literature used the same dataset, ITC-2007 with the CB-CTT track type, consisting of 21 datasets, each tested 30 times. Table 4 shows the performance of each algorithm based on violations of soft constraints. In some datasets the proposed GA method had a smaller fitness value than the other literature. In the 11th dataset each algorithm obtained an average fitness value of 0, which means that the solution reached the global optimum without any violations. In the 12th dataset, GA from Akkan & Gülcü was 367.2 while