
Solar cell, telescope and other optical component manufacturers may be able to design better devices more quickly with artificial intelligence (AI).
Opto Generative Pretrained Transformer (OptoGPT), a decoder-only transformer, developed by University of Michigan (U-M) engineers, harnesses the computer architecture underpinning ChatGPT to work backward from desired optical properties to the material structure that can provide them.
The new algorithm designs optical multilayer film structures — stacked thin layers of different materials — that can serve a variety of purposes. Well-designed multilayer structures can maximize light absorption in a solar cell or optimize reflection in a telescope. They can improve semiconductor manufacturing with extreme UV light, and make buildings better at regulating heat with smart windows that become more transparent or more reflective depending on temperature.
OptoGPT produces designs for multilayer film structures within 0.1 seconds, almost instantaneously. In addition, OptoGPT’s designs contain six fewer layers on average compared to previous models, meaning its designs are easier to manufacture.
“Designing these structures usually requires extensive training and expertise, as identifying the best combination of materials and the thickness of each layer is not an easy task,” said L. Jay Guo, U-M Professor of Electrical and Computer Engineering and corresponding author of the study published in Opto-Electronic Advances.
For someone new to the field, it’s difficult to know where to start. To automate the design process for optical structures, the research team tailored a transformer architecture — the machine learning framework used in large language models like OpenAI’s ChatGPT and Google’s Bard — for their own purposes.
“In a sense, we created artificial sentences to fit the existing model structure,” Guo said.
The model treats materials at a certain thickness as words, also encoding their associated optical properties as inputs. Seeking out correlations between these “words,” the model predicts the next word to create a “phrase”— in this case a design for an optical multilayer film structure — that achieves the desired property such as high reflection.
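The “words” idea can be made concrete with a small sketch. It is an illustration only, not the researchers’ code: the material list, the thickness grid, the `Material_thickness` token spelling and the BOS/EOS markers are all assumptions chosen for readability.

```python
from typing import Dict, List, Tuple

# Hypothetical vocabulary ingredients, chosen only for illustration.
MATERIALS = ["Ag", "SiO2", "TiO2"]
THICKNESSES_NM = list(range(10, 101, 10))  # 10, 20, ..., 100 nm

# Assumed special tokens marking the start and end of a structure.
BOS, EOS = "BOS", "EOS"


def build_vocab() -> Dict[str, int]:
    """Map every (material, thickness) 'word' and special token to an integer id."""
    tokens = [BOS, EOS] + [f"{m}_{t}" for m in MATERIALS for t in THICKNESSES_NM]
    return {tok: i for i, tok in enumerate(tokens)}


def serialize(layers: List[Tuple[str, int]]) -> List[str]:
    """Turn a stack of (material, thickness_nm) layers into a token 'sentence'."""
    return [BOS] + [f"{m}_{t}" for m, t in layers] + [EOS]


def encode(layers: List[Tuple[str, int]], vocab: Dict[str, int]) -> List[int]:
    """Serialize a stack and look up the id of each token."""
    return [vocab[tok] for tok in serialize(layers)]


def decode(ids: List[int], vocab: Dict[str, int]) -> List[Tuple[str, int]]:
    """Recover the layer stack from token ids, stopping at the end marker."""
    inverse = {i: tok for tok, i in vocab.items()}
    layers = []
    for i in ids:
        tok = inverse[i]
        if tok == BOS:
            continue
        if tok == EOS:
            break
        material, thickness = tok.rsplit("_", 1)
        layers.append((material, int(thickness)))
    return layers


vocab = build_vocab()
stack = [("TiO2", 50), ("SiO2", 100)]
ids = encode(stack, vocab)
```

In this toy setup a two-layer stack becomes a four-token sequence, and a decoder-only model would be asked to predict each token from the ones before it together with the target optical properties.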
Researchers tested the new model’s performance using a validation dataset containing 1,000 known design structures including their material composition, thickness and optical properties. When comparing OptoGPT’s designs to the validation set, the difference between the two was only 2.58 percent, lower than the 2.96 percent difference for the closest-matching optical properties in the training dataset.
Similar to how large language models are able to respond to any text-based question, OptoGPT is trained on a large amount of data and able to respond well to general optical design tasks across the field.
If researchers are focused on a task, like designing a high-efficiency coating for radiative cooling, they can use local optimization — adjusting variables within bounds to achieve the best possible outcome — to further fine-tune the thickness to improve accuracy. During testing, the researchers found fine-tuning improves accuracy by 24 percent, reducing the difference between the validation dataset and OptoGPT responses to 1.92 percent.
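As a rough sketch of what local optimization within bounds can look like, the example below nudges layer thicknesses to reduce the mismatch between a computed and a target reflection spectrum. Several things here are assumptions, not details from the study: the constant refractive indices, the 10 to 500 nm bounds, the mean-absolute-difference error and the simple coordinate search all stand in for whatever the researchers actually used. The reflectance itself is computed with the standard characteristic-matrix method for normal incidence.

```python
import cmath
import math


def reflectance(layers, wavelength_nm, n_in=1.0, n_sub=1.5):
    """Normal-incidence reflectance of a stack of (index, thickness_nm) layers,
    listed from the incident side, using the characteristic-matrix method."""
    m11, m12, m21, m22 = 1, 0, 0, 1
    for n, d in layers:
        delta = 2 * math.pi * n * d / wavelength_nm  # phase thickness of the layer
        cos_d, sin_d = cmath.cos(delta), cmath.sin(delta)
        a11, a12, a21, a22 = cos_d, 1j * sin_d / n, 1j * n * sin_d, cos_d
        m11, m12, m21, m22 = (m11 * a11 + m12 * a21, m11 * a12 + m12 * a22,
                              m21 * a11 + m22 * a21, m21 * a12 + m22 * a22)
    b = m11 + m12 * n_sub
    c = m21 + m22 * n_sub
    r = (n_in * b - c) / (n_in * b + c)
    return abs(r) ** 2


def spectrum_error(thicknesses, indices, wavelengths, target):
    """Mean absolute difference between the stack's spectrum and the target."""
    layers = list(zip(indices, thicknesses))
    diffs = [abs(reflectance(layers, wl) - t) for wl, t in zip(wavelengths, target)]
    return sum(diffs) / len(diffs)


def fine_tune(thicknesses, indices, wavelengths, target,
              bounds=(10.0, 500.0), step=8.0, min_step=0.25):
    """Coordinate search: try +/- step on each layer, keep improvements,
    clamp to the bounds, and halve the step when nothing improves."""
    best = list(thicknesses)
    best_err = spectrum_error(best, indices, wavelengths, target)
    while step >= min_step:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] = min(max(trial[i] + delta, bounds[0]), bounds[1])
                err = spectrum_error(trial, indices, wavelengths, target)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            step /= 2
    return best, best_err


# Hypothetical two-layer example: constant indices loosely resembling TiO2 and SiO2.
indices = [2.4, 1.46]
wavelengths = list(range(400, 701, 25))
target = [reflectance(list(zip(indices, [55.0, 100.0])), wl) for wl in wavelengths]

start = [60.0, 90.0]  # stands in for a model-proposed design
start_err = spectrum_error(start, indices, wavelengths, target)
tuned, tuned_err = fine_tune(start, indices, wavelengths, target)
```

Because the search only accepts changes that lower the error, the tuned design is never worse than the starting one; how much it improves depends on the target and the starting point.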
Taking analysis a step further, the researchers used a statistical technique to map out associations that OptoGPT makes.
“The high-dimensional data structure of neural networks is a hidden space, too abstract to understand. We tried to poke a hole in the black box to see what was going on,” Guo said.
When mapped in a 2D space, materials cluster by type, such as metals and dielectric materials, which are electrically insulating but can support an internal electric field. All dielectrics, including semiconductors, converge upon a central point as the thickness approaches 10 nanometers. From an optics perspective, the pattern makes sense because light behaves similarly regardless of material as layers approach such small thicknesses, which helps further validate OptoGPT’s accuracy.
Known as an inverse design algorithm because it starts with the desired effect and works backward to a material design, OptoGPT offers more flexibility than previous inverse design approaches, which were developed for specific tasks. It enables researchers and engineers to design optical multilayer film structures for a wide range of applications.
Discussion and Conclusion
Optical multilayer thin film structures are widely used in photonic applications. However, existing inverse design methods have several drawbacks: they either fail to adapt quickly to different design targets or are difficult to apply to different types of structures, e.g., structures that use different materials at each layer. These methods also cannot accommodate design situations involving different incident angles and polarizations. In addition, how inverse design can benefit practical fabrication and manufacturing has not yet been extensively considered.
By converting the multilayer structure into a sequence using structure tokens and structure serialization, we propose OptoGPT to deal effectively with the non-trivial inverse design problem in multilayer structures. Combined with the other proposed techniques, our model can unify inverse design across different types of input targets under different incident angles and polarizations, handle different types of structures, and facilitate the fabrication process by providing diversity and flexibility in its designs. We hope the development of OptoGPT will make inverse design of multilayer thin film structures effective in methodology and easily accessible to researchers and engineers.
The findings on OptoGPT’s hidden representations suggest that it has acquired domain-specific knowledge about optical multilayer structures through the training process. Furthermore, the model has demonstrated the capacity to apply this knowledge effectively in the inverse design process. However, the current framework still lacks explainability and does not allow users to directly understand the physical principles involved in its designs. For example, is there a general principle for designing absorbers and distributed Bragg reflectors (DBRs)? How can high-saturation structural color be designed step by step? We hope future work can find a way to extract and formulate these design principles from the model and apply them to guide inverse design.
In addition, our model can be extended to higher-dimensional, more complicated photonic structures, e.g., 2D metasurfaces or 3D waveguides, using a tokenization method similar to that of the Vision Transformer [47]. However, one limitation is that our model requires a large dataset for training, which is also a common criticism of many GPT models. For example, ChatGPT is trained on billions of tokens using ~10,000 GPUs, costing ~$10 million for a single training run. In this work, because of constraints on computational resources, we had to simplify our design problems by limiting the types of materials, the spectral range, the thickness discretization and the maximum number of layers that can be designed, all of which can be extended with more computational resources.
Although the model was trained on a large-scale dataset of 10 million samples, it is important to recognize that this dataset covers only a small fraction (~10^-52) of the expansive and complex design space of optical multilayer thin film structures. Because of this limitation, OptoGPT may fail to find a design that lies outside the boundaries of the sampled design space. Close collaboration across multiple research groups is needed to obtain a more general photonic inverse design model that extends to more complicated structures.
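The scale of that fraction can be checked with a short back-of-the-envelope calculation. The sample count and the fraction come from the text above; the material count, the number of thickness steps and the layer limit passed to the helper below are hypothetical values picked only to show how a design space of this size can arise, and are not taken from this article.

```python
import math

TRAINING_SAMPLES = 10_000_000   # from the text: 10 million samples
COVERED_FRACTION_EXP = -52      # from the text: fraction of roughly 10^-52

# Implied size of the full design space: samples / fraction = 10^7 / 10^-52.
implied_log10_space = math.log10(TRAINING_SAMPLES) - COVERED_FRACTION_EXP


def log10_design_space(num_materials, num_thicknesses, max_layers):
    """log10 of the number of stacks with 1..max_layers layers, where each
    layer is any (material, thickness) combination."""
    tokens = num_materials * num_thicknesses
    total = sum(tokens ** k for k in range(1, max_layers + 1))
    return math.log10(total)


# Hypothetical settings for illustration only (not stated in this article).
example_log10 = log10_design_space(num_materials=18, num_thicknesses=50, max_layers=20)
```

The implied design space holds on the order of 10^59 candidate structures, and the hypothetical settings above land in the same range, which shows how quickly the space grows with each additional layer.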
This article was written by Patsy DeLacey, Research Communications Specialist, University of Michigan, College of Engineering (Ann Arbor, MI), with material from the study written by a team of researchers from the University of Michigan. It has been edited. For more information, contact Kate McAlpine at

