Numerical weather prediction and climate models continue to have large errors for stable boundary layers (SBL). To understand and reduce these errors, three atmospheric boundary layer (ABL) model inter-comparison studies have so far been organised within the Global Energy and Water Cycle Experiment (GEWEX) of the World Climate Research Programme (WCRP). The previous GEWEX ABL Studies (GABLS) have brought together about 20 research groups to model the SBL (GABLS1), the diurnal cycle (GABLS2, GABLS3) and the nocturnal low-level jet (GABLS3). With GABLS4 we aim to further the understanding of the performance and challenges of numerical models in very stable conditions and to contribute to the development of parameterization schemes. In this study we explore the set-up of GABLS4 over the Brunt Ice Shelf, Antarctica, where the British Antarctic Survey carries out measurements at the Halley station, including flux and profile measurements from a 32-m-high instrumented mast as well as tethersonde and rawinsonde soundings. The preparatory work towards GABLS4 includes mesoscale model experiments for selected periods (pre-GABLS4). One of these periods is May 2003, when a very stable boundary layer developed at Halley. On 18 May, the 1-m wind speed dropped below 1 m/s, the 1-m air temperature dropped to -35°C, and a 15°C inversion developed in the 31-m layer observed by the mast instruments. The observed air temperature and wind included many oscillations with typical periods of 0.5 to 2 h, which introduce further challenges. The 18 May case was simulated using four mesoscale models: Polar WRF, HIRLAM, AROME, and the Unified Model (UM). All models took their initial and boundary conditions from ECMWF analyses and applied two or three nested domains, with a horizontal resolution of 2.5 to 4 km in the finest domain, which covered the Brunt Ice Shelf, mostly ice-covered ocean, and parts of the sloping ice sheet.
The vertical resolution and the physical parameterization schemes for ABL turbulence, radiation, clouds, and heat conduction in the snow varied between the models. The model results were validated against the observations, with attention paid to, among other things, the decoupling of the snow surface and the SBL, and the surface energy balance terms. The modelling challenges include the heat conduction in the snow and the decrease of the downward sensible heat flux with increasing air-surface temperature difference. The pre-GABLS4 mesoscale experiments provided information that is essential for the selection of the GABLS4 case, to be addressed by single-column models (SCM) and large-eddy simulation (LES) models. The pressure gradient was weak during the study period, with mostly ageostrophic winds at Halley. Lateral heat advection was also weak, and the mesoscale model results can probably be used to prescribe it for the SCM experiments. These conditions favour setting up a GABLS4 inter-comparison case for SCMs, but this needs further discussion.