In the standard Romance languages, auxiliary selection in the periphrastic tenses follows clear rules. Italian and French display a two-auxiliary system, in which have is used with transitives and unergatives, and be with unaccusatives and reflexives. Spanish, by contrast, has generalised have across the entire verb system.
In many Italo-Romance dialects, however, these regularities break down: the two auxiliaries alternate within the same tense–aspect–mood paradigm, giving rise to the phenomenon known as mixed paradigms. Across a wide area of the Italian peninsula, speakers use be and have in unexpected combinations. For instance, one dialect may use be only in the first person singular and have elsewhere, while another follows a different distribution. In many varieties, the pattern even changes from one tense to another: for example, a verb may alternate between auxiliaries in the present perfect but generalise have in the pluperfect. This variation disrupts the uniformity that characterises the standard languages, both within a single paradigm (intraparadigmatic) and across paradigms (interparadigmatic).
Mixed paradigms can involve different verb types (transitives, unaccusatives, or reflexives) and occasionally affect only one of these classes. Additional phenomena, such as agreement (marking gender and number on past participles) and Raddoppiamento Fonosintattico (consonant lengthening), also interact with auxiliary choice, further enriching the system’s structural complexity.
The MIXPAR database
To document and analyse these patterns, the MIXPAR project has created a standardised, reusable database designed for cross-dialectal comparison and statistical analysis. Each entry corresponds to one inflected form, annotated with linguistic and geographic information. The database is currently available in several downloadable formats: a full Excel version and two CSV files (one of which is tailored to the analysis of whole paradigmatic patterns).
These resources can be accessed under the section Dataset.
There is also an interactive online version of the database, currently under development. (Any comments, suggestions, or notices of errors are very welcome.)
Core variables include:
- Form – the inflected form as recorded in the source
- Cell – person–number combination (1SG–3PL)
- Verb_Construction – verb or complex predicate
- Class – transitive, unergative, unaccusative, or reflexive subtype
- TAM – tense–aspect–mood (e.g. PRF, PLPF)
- AUX – selected auxiliary (E, H, E≈H, H≈E, or F)
- AGREE – presence or absence of overt agreement
- RF – presence of Raddoppiamento Fonosintattico
- Place, Province, Region – geographic metadata
- Latitude/Longitude – GPS coordinates
- Dialect classification – based on Pellegrini (1977)
- Source – bibliographic reference and notes
This structure makes it possible to capture both systematic and irregular behaviour across nearly 200 locations. By combining morphological, dialectological, and quantitative approaches, the MIXPAR database provides a unique empirical foundation for investigating the diversity of auxiliary systems in Italo-Romance.
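As a rough illustration of how the variables listed above could be used to recover whole paradigms, the following Python sketch groups entries by place, verb and TAM and checks whether the auxiliary varies across cells or across tenses. It is not part of the MIXPAR resources. The column names are assumed to match the variable names above, the intermediate cell labels (2SG to 2PL) are assumed, and the sample rows are invented placeholders rather than attested data.

```python
import csv
import io
from collections import defaultdict

# Paradigm order of the Cell variable. Only 1SG and 3PL are named in the
# variable list; the intermediate labels are an assumption.
CELLS = ["1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]

# Invented placeholder rows (NOT attested data), using a subset of the
# variables and assuming the CSV headers match the variable names.
SAMPLE_CSV = """\
Place,Verb_Construction,TAM,Cell,AUX
PlaceA,verb1,PRF,1SG,E
PlaceA,verb1,PRF,2SG,H
PlaceA,verb1,PRF,3SG,H
PlaceA,verb1,PRF,1PL,H
PlaceA,verb1,PRF,2PL,H
PlaceA,verb1,PRF,3PL,H
PlaceA,verb1,PLPF,1SG,H
PlaceA,verb1,PLPF,2SG,H
PlaceA,verb1,PLPF,3SG,H
PlaceA,verb1,PLPF,1PL,H
PlaceA,verb1,PLPF,2PL,H
PlaceA,verb1,PLPF,3PL,H
"""


def paradigm_patterns(rows):
    """Map (place, verb, TAM) to a tuple of AUX values in CELLS order."""
    paradigms = defaultdict(dict)
    for row in rows:
        key = (row["Place"], row["Verb_Construction"], row["TAM"])
        paradigms[key][row["Cell"]] = row["AUX"]
    # Cells with no entry are marked "?" rather than guessed.
    return {
        key: tuple(cells.get(cell, "?") for cell in CELLS)
        for key, cells in paradigms.items()
    }


def is_mixed(pattern):
    """True if the attested cells of one paradigm show more than one AUX value.

    Simplification: alternation codes such as E≈H are treated as plain values.
    """
    return len({aux for aux in pattern if aux != "?"}) > 1


def differs_across_tam(patterns):
    """Return the (place, verb) pairs whose pattern changes from one TAM to another."""
    seen = defaultdict(set)
    for (place, verb, _tam), pattern in patterns.items():
        seen[(place, verb)].add(pattern)
    return {key for key, distinct in seen.items() if len(distinct) > 1}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
patterns = paradigm_patterns(rows)

for key, pattern in patterns.items():
    label = "mixed" if is_mixed(pattern) else "uniform"
    print(key, "-".join(pattern), label)
print("Differs across TAM:", differs_across_tam(patterns))
```

The placeholder rows mimic the situation described earlier: be only in the first person singular of the present perfect, and have throughout the pluperfect. With a real export, the same functions could be applied to the rows read from the downloaded CSV file, provided its headers match the assumed names.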
